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Abstract 

A search  algorithm  for  the  solution  of  the  fault  diagnosis 
equations  arising  in  linear  time  invariant  analog  circuits  and  systems 
is  presented.  By  exploitation  of  Householder's  formula  an  efficient 
algorithm  whose  computational  complexity  is  a function  of  the  number 
of  system  failures  rather  than  the  number  of  system  components  is 
obtained . 

Introduction 

Conceptually,  the  fault  analysis  problem  for  an  analog  circuit 
or  system  amounts  to  the  measurement  of  a set  of  externally  access- 
ible parameters  of  the  system  from  which  one  desires  to  determine  the 
internal  system  parameters  or  equivalently1  locate  the  failed  com- 
ponents as  illustrated  in  Figure  1.  Here,  the 


measurements,  nu,  may  represent  data  taken  at  distinct  test  points  or 
alternatively,  data  taken  at  a fixed  test  point  under  different  stimuli. 

1Since  the  problem  of  determining  the  values  of  the  failed  ccrponents  is  usually 
straightforward , once  the  failures  have  been  located,  the  exact  determination  of 
all  internal  ccnponent  parameters  is  essentially  equivalent  to  the  problem  of 
"simply"  locating  the  failed  ccrponents. 


J 
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Similarly,  the  represent  parameters  characterizing  the  various 
internal  system  components.  Here,  a single  parameter  may  character- 
ize an  entire  component,  say  a resistance,  capacitance  or  inductance. 
Alternatively,  a component  may  be  represented  by  several  parameters: 
the  h-parameters  of  a transistor,  the  poles  and  gain  of  an  op-amp, 
etc.  In  general,  one  models  a system  component  by  the  minimum  number 
of  parameters  which  will  allow  the  failure  to  be  isolated  up  to  a 
"shop  replaceable  assembly"  with  all  "allowed"  system  failures  mani- 
festing themselves  in  the  form  of  some  parameter  change. 

To  solve  the  fault  diagnosis  problem,  one  then  measures 
m = col(nu)  and  solves  a nonlinear  algebraic  equation 

1 . m = F (r ) 

for  r = col(r^)  to  diagnose  the  fault.  For  linear  time-invariant 
systems  the  function  F can  be  expressed  analytically,  as  we  shall  see 
in  the  following  section.  More  generally,  in  the  nonlinear  case,  one 
can  evaluate  F(r)  for  any  given  parameter  vector,  r,  with  a simulator, 
and  thus  solve  1.  numerically,  even  though  F has  no  analytic  expression. 

Although  one  does  not  usually  formulate  the  fault  diagnosis  pro- 
blem  in  terms  of  the  above  described  equation  solving  notation,  this 

9 

formulation  is  equivalent  to  the  classical  fault  simulation  concept. 

Indeed,  fault  simulation  is  simply  a searc  i algorithm  for  solving  1. 

Here,  one  precomputes  m = f(r)  for  each  allowable1  faulty  parameter 
vector  r and  then  compares  the  measured  m with  the  simulated  m's,  stored 
in  a fault  dictionary,  to  solve  equation  1. 
l 

By  allowable  faults  we  mean  all  possible  parameter  vectors,  r,  which  satisfy  a 
specified  set  of  fault  hypotheses.  These  typically  restrict  the  maximum  number  of 
component  parameters  which  are  simultaneously  out  of  tolerence  and  the  type  of 
failure  (open  circuit,  short  circuit,  small  change,  etc.) 
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Although  the  above  described  approach  to  fault  simulation  has 
been  successful1  when  applied  to  digital  system,  there  is  considerable 
question  surrounding  its  applicability  to  analog  circuits  and  systems^-. 

The  problem  here  is  two-fold.  First,  rather  than  simply  failing  as 
a one  or  zero,  an  analog  parameter  has  a continuum  of  possible  fail- 
ures. Secondly,  unlike  a digital  system  wherein  a component  is  either 
good  or  bad,  in  an  analog  system,  a component  parameter  is  either  in 
tolerence  or  out  of  tolerence.  As  such,  for  each  hypothesized  failure, 
it  may  prove  necessary  to  do  an  entire  family  of  Monte  Carlo  simu- 
lations in  which  the  values  of  the  good  components  are  randomly  chosen 
within  their  tolerence  limits.  Although,  at  the  present  time  we  have 
insufficient  practical  experience  to  determine  the  precise  number  of 
fault  simulations  required  for  analog  fault  diagnosis,  it  is  estimated 
that  the  number  of  simulations  required  for  an  analog  system  will  ex- 
ceed the  number  of  simulations  required  for  a digital  system  of  similar 
complexity  by  a factor  ranging  between  two  and  six  order  of  magnitude.^" 

As  such,  the  fault  simulation  concept  which  has  proven  to  be  so 
successful  for  a digital  system  may  not  be  applicable  in  the  analog  case. 

As  an  alternative  to  fault  simulation,  one  may  adopt  one  of  the 

2 3 

more  classical  equation  solving  algorthms  for  the  solution  of  1.  ' 

Here,  one  first  measures  m and  on  the  basis  of  this  measurement,  makes 
an  initial  guess  r°  (usually  taken  to  be  nominal  parameter  vector) 
at  the  solution  of  the  equations.  One  then  evaluates  m°  = F(r°)  and 
compares  it  with  m.  If  m°  = m,  r°  is  the  solution  to  the  fault  diag- 

| 

1 ... 

Most  industrial  users  of  ATE  obtain  satisfactory  fault  detection  in  digital 

circuits  via  fault  simulation  techniques  but  require  guided  probe  techniques 

in  addition  to  the  fault  dictionary  data  for  fault  diagnosis  (isolation) . 

= , J 
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nosis  equation.  If  not,  one  makes  a new  "educated"  guess  at  the 
solution,  r\  (usually  based  on  the  deviation  between  m and  m°)  and 
repeats  the  process  by  evaluating  m1  = Ffr1)  and  comparing  it  with  m. 
Hopefully,  sequence  of  component  parameter  vectors,  r1,  and  simulated 
data  vectors,  m1  = Ffr1),  is  obtained  which  "quickly"  converges  to 
r and  m,  respectively.  Since  the  evaluation  of  Fir1)  is  essentially 
equivalent  to  the  simulation  of  the  system  with  the  faulty  parameter 
values,  r1 , this  technique  is  really  another  form  of  fault  simulation. 
In  this  case,  however,  one  simulates  the  system  after  the  data  vector 
has  been  measured  and  uses  this  data  to  make  an  educated  guess  at  a 
(hopefully)  small  number  of  parameter  vectors  at  which  the  system 
should  be  simulated.  As  such,  the  approach  has  been  termed  simulation 
after  test^  to  distinguish  it  from  the  classical  approach  wherein  all 
simulation  is  done  before  test.'*' 

At  the  time  of  this  writing,  both  approaches  are  under  study'*', 
neither  of  which  have  been  shown  to  be  superior.  Fault  "simulation 
after  test"  requires  that  one  include  an  efficient  simulator  in  the  ATE 
itself,  which  can  be  used  for  on-line  computation  of  m1  = F(r^)  after 
the  UUT  has  been  measured.  On  the  other  hand,  simulation  after  test 
eliminates  the  requirement  of  searching  a large  fault  dictionary  for 
the  (approximate)  data  matches  required  by  "simulation  before  test."  In 
addition,  the  complex  ATPG  requirement  for  "simulation  before  test"  is 
eliminated. 

To  make  "simulation  after  test"  feasible,  however,  an  efficient 
equation  solving  algorithm  is  required  to  obtain  convergence  of  the  r1 
sequence  in  a reasonable  amount  of  time.  Moreover,  since  "real  world" 
failures  in  analog  circuits  and  systems  often  take  the  form  of  open  and 
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short  circuited  components  or  large  parameter  deviations  from  nominal 
the  classical  perturbational  algorithms  a-la  Newton- Raphson  are  in- 
applicable. Fortunately,  in  the  context  of  the  fault  diagnosis  pro- 
blem, one  can  reasonable  assume  that  relatively  few  component  para- 
meters have  failed.  As  such,  even  though  it  is  not  valid  to  assume 
that  r-r°  (the  deviation  of  r from  nominal)  is  small  in  norm,  it  is 
reasonable  to  assume  that  it  is  small  in  "rank."  The  purpose  of  the 
present  paper  is  to  formulate  a search  algorithm  for  the  solution  of 
the  fault  diagnosis  equations  which  exploits  such  an  assumption. 

In  the  following  section,  the  explicit  form  for  the  fault  diagnosis 

equations  arising  in  linear  time-invariant  circuits  and  systems  is 
3 4 

derived.  Householder's  formula  is  then  used  to  exploit  the  special 
form  of  these  equations  in  combination  with  an  assumption  that  r differs 
from  r°  in  relatively  few  coordinants  to  formulate  a search  algorithm 
for  the  solution  of  the  fault  diagnosis  equations  in  which  the  compu- 
tational complexity  of  the  simulation  process  is  a function  of  the 
number  of  the  failures  rather  than  the  number  of  components.  This  algor- 
ithm is  based  on  a similar  algorithm  suggested  by  Temes5  for  "simulation 
before  test"  and  a large  - change  sensitivity  algorithm  first  given  by  Leung 
and  Spence.®  Finally,  examples  of  the  application  of  the  algorithm  to 
active  and  passive  circuits  are  presented  and  a study  of  the  robustness 
of  the  algorithm  to  deviations  of  the  "good"  components  from  their 
nominal  values  is  presented. ^ 

Explicit  Form  of  the  Fault  Diagnosis  Equations 

In  the  case  of  a linear  time-invariant  circuit  or  system,  the  fault 
diagnosis  equations  discussed  abstractly  in  the  previous  section,  may 
be  expressed  explicitally  in  analytical  form.  Indeed,  it  is  the  ex- 


plicit  nature  of  this  form  which  makes  our  simplified  solution  algor- 
ithm possible.  Since  the  fault  diagnosis  equations  deal  with  the 
relationship  between  the  externally  measureable  system  parameters,  m, 
and  the  internal  component  parameters,  r,  we  adopt  a "component  connect- 
ion model"  as  the  starting  point  for  the  derivation  of  the  fault  diag- 

g 

nosis  equations.  This  is  one  of  several  commonly  employed  large 
scale  system  models  in  which  the  components  and  connections  in  a circuit 
or  system  are  modeled  by  distinct  equations,  thereby  permitting  one 
to  explicitally  deal  with  the  relationship  between  the  individual  com- 
ponent parameters  and  the  composite  system  parameters. 

Since  the  present  study  is  resticted  to  linear  time-invariant 
systems,  we  assume  that  each  component  is  characterized  by  a transfer 
function  matrix  which  is  dependent  on  the  potentially  variable  com- 
ponent parameters,  Z^(s,r).  For  the  classical  RLC  components  Z.(s,r) 
may  take  the  form  R,  Ls,  or  1/sC  for  the  case  of  a resistor,  inductor, 
or  capacitor,  respectively.  More  generally,  one  may  model  an  op-amp 
by  the  transfer  function  k/ts-pi) (S-P2)  where  the  parameter  vector,  r, 

now  represents  the  three  potentially  variable  component  parameters;  k, 

sT  • 

Pi , and  p2 ; or  a delay  by  ke  , etc.  Although  the  symbol  Z is  used 
(for  historical  reasons) , the  components  are  not  assumed  to  be  repre- 
sented by  impedance  matrices.  Indeed,  hybrid  models  are  used  in  most 
of  our  examples.  For  the  purpose  of  analysis,  it  is  assumed  that  all 
faults  manifest  themselves  in  the  form  of  changes,  possibly  catastophic, 
in  the  parameter  vector,  r,  with  the  frequency  characteristics  of  the 
components  unchanged.  Although  not  universal,  this  fault  hypothesis 
covers  the  most  commonly  encountered  situations  and  subsumes  the  common 
industrial  practice  of  assumming  that  all  failures  in  analog  circuits 
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and  systems  take  the  form  of  open  and  short  circuited  components. 

Our  system  components  are  thus  characterized  by  a set  of  simul- 
taneous equations 

2.  bi  = Zi(s,r)ai  i = 1,2,  ...,  n 

where  a^  and  b^  denote  the  component  input  and  output  vectors,  re- 
spectively. For  notationa.l  brevity,  these  component  equations  may  be 
combined  into  a single  block  diagonal  matrix  equation 

3.  b = Z (s , r)  a 

where  b = col(b^),  a = col(a^)  and  Z(s,r)  = diag (Z^ (s ,r) ) . Although 
3.  is  written  as  a single  equation,  it  is  important  to  remember  that  it 
represents  a set  of  decoupled,  simultaneous  equations,  in  which  Z(s,r) 
is  block  diagonal.  Indeed,  we  will  exploit  this  fact  in  the  application 
of  Householder's  formula. 

Although  there  are  many  ways  to  represent  the  connection  in  a 
circuit  or  system;  say  a block  diagram,  linear  graph  or  signal  flow 
graph,  any  such  representation  is  simply  a graphical  means  for  dis- 
playing a set  of  connection  equations:  Kirchoff  laws,  adder  equations, 
etc.  As  such,  for  our  component  connection  model  we  adopt®  a purely 
algebraic  connection  model  in  which  the  connection  equations  are  dis- 
played explicitally  without  the  intermediary  of  some  kind  of  graphical 
connection  diagram.  This  takes  the  form 


4. 

a - L^b  + L^2 

y = I,21b  + l22 

where  u and  y represent  the  vectors  of  accessible  inputs  and  outputs 
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which  are  available  to  the  test  system.  In  simple  systems,  the 

connection  matrices,  L are  usually  obtainable  by  inspection, 

whereas,  in  more  complex  systems,  computer  codes  have  been  developed 

for  their  derivation.^  Moreover,  they  are  assured  to  exist  in  all 

0 

but  the  most  pathalogical  systems. 

It  is  the  pair  of  simultaneous  matrix  equations  3 and  4 which 
are  termed  the  component  connection  model.  By  combining  equations 

3 and  4 to  eliminate  the  component  input  and  output  variables,  a 

3 8 

and  b,  one  may  derive  ' an  expression  for  the  transfer  function 
matrix  observable  by  the  test  system  between  the  test  input  and 
output  vectors,  u and  y,  obtaining 

5.  S(s,r)  = L22  + L21(l  - Z(s,r)Ln)-1Z(s,r)L12 

where 

6 . y = S (s  , r)  u 

For  a linear  time-invariant  system  the  transfer  function  S(s,r) 
is  a complete  description  of  the  measurable  data  about  the  UUT 
available  to  the  test  system.  Moreover,  being  rational  it  is  com- 
pletely determined  by  its  value  at  a finite  number  of  frequencies. 

As  such,  without  loss  of  generality  we  may  take  our  vector  of 
measured  data  to  be  of  the  form 

7.  m = col[S  ^ , r)  , S(s2,r),  ...,  S(sk,r)] 


The  fault  diagnosis  equations  then  take  the  form 
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S(Sl,r) 


^22^^21  (s^  ,r)  L^)  Z(s^,r)L^2 


s (s2  ,r) 


L22+L21 ^1~Z ^S2 ' Lll^  Z(s2,r)L12 


8 m = 


= F (r ) 


S (sk,r) 


L22+L21 *1_Z (sk'r^ Lll*  Z^sk'r^L12 


In  the  present  context  we  will  assume  that  s^,  s2,  ...»  s^  represent 
sufficiently  many  frequencies  to  permit  the  fault  diagnosis  equations  to 
be  solved.  Indeed,  algorithms  for  determining  such  a set  of  frequencies 
when  they  exist  are  given  in  references  10,11,  and  12.  The  problem 
at  hand  is  the  development  of  an  efficient  algorithm  for  the  solution 
of  these  fault  diagnosis  equations. 

Householder's  Formula  and  the  Search  Algorithm 

Given  the  explicit  form  of  fault  diagnosis  equations  of  8,  it  is 
apparent  that  the  vast  majority  of  the  computation  required  for  the 
simulation  of  F(r),  either  before  or  after  test,  is  the  inversion 
of  the  family  of  matrices;  (1  - Zfs^rjL^),  i = 1,2,  ...,  k.  For- 
tunately, given  the  assumption  that  relatively  few  components  have 
failed,  i.e.  that  r differs  from  its  nominal  value,  r°,  in  only  a 

4 

small  number  of  coordinents,  Householder's  formula  may  be  invoked 
to  compute  (1-Z  (s^,r)  L^)  ^ in  terms  of  (1  - Z (s^ , r°)  L^)  together 

with  the  inversion  of  a small  dimensional  matrix.  More  precisely, 
if  A,  B,  C,  and  D are  given  matrices  of  dimension  nxn,  nxn,  nxp, 
and  pxn,  respectively,  where 

9.  A » B + CD 


then 
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10. 


A 1 = [1  - B_1C(1  + DB  LC)  1D]B~1 


As  such,  once  B ^ is  known,  one  may  compute  the  inverse  of  the  nxn 
matrix,  A,  in  terms  of  B ^ and  the  inverse  of  the  pxp  matrix 
(1  + DB_ ) . This  technique  has  been  used  effectively  for  large 
change  sensitivity  analysis*’  and  has  recently  been  suggested  by 
Temes  for  application  to  fault  simulation.'’  This  is  achieved  by 
exploiting  the  block  diagonal  character  of  Z(s,r).  Thus  if  r 
differs  from  r°  in  q coordinents  Z(s,r)  will  differ  from  Z(s,r°) 
only  in  the  pxp  block  composed  of  components  which  are  effected  by 
the  faulty  parameters1.  If  the  rows  and  columns  of  Z(s,r)  are  re- 
ordered so  that  this  block  appears  in  the  upper  left  corner  of 
Z(s,r)  then. 


11. 


Z (s^r) 


Z ( s^ , r° ) 


+ 


where  A is  pxp  and  Z(s,r)  is  nxn.  We  then  have 
12.  (1  - Z(si,r)L11)  = (L  - Z(si,r°)L11) 


+ 


-A (si ,r) 
0 


where  denotes  the  upper  (after  reordering)  p rows  of  L.^. 

Finally,  an  application  of  Householder's  formula  yields 


Here,  p is  the  sum  of  the  dimensions  of  all  the  blocks  of  Z(s,r)  which  are 
dependent  on  the  q coordinents  in  which  r differs  from  r°.  Typically, 
q . p with  the  exact  relationship  depending  the  block  sizes. 


(1-Z (si,r)L11) 


l-(l-Z(si,r°)L11)-1  "Msi'r) 


l+LP1(l-Z(si,r°)L11)"1  Ztllilll 


LU  (l-Z(si,r0)L11)_1 


Although  quite  complex,  the  only  matrix  major  computation  required  for  the 
inversion  of  (1  - Z(s^,r)L^)  via  13  is  the  inversion  of  the  pxp 
matrix  in  parentheses.  As  such,  as  long  as  the  number  of  faulty  parameter 
values  remains  small,  equation  13  represents  an  extremely  efficient 
means  of  carrying  out  a large  number  of  fault  simulations  with 
relatively  little  computational  capacity.  Although  Temes  originally 
suggested  the  technique  in  the  context  of  a "simulation  before  test" 
algorithm,  the  above  application  of  Householder's  formula  is  ideally 
suited  for  "simulation  after  test",  wherein,  it  reduces  the  compu- 
tational requirements  for  the  simulation  process  to  well  within 
the  capabilities  of  the  minicomputers  usually  found  in  modern  ATE. 

Although  Householder's  formula  yields  an  efficeint  means  for 
solving  the  fault  diagnosis  equations  once  the  faulty  parameters 
have  been  determined,  it  remains  to  locate  the  set  of  fault  para- 
meters. Fortunately,  the  efficiency  of  the  solution  algorithm  based 
on  Householder's  formula  is  such  that  one  can  justify  a search 
through  "all"  allowable  sets  of  faulty  parameters  to  locate  the 
actual  failures.  Indeed,  if  we  denote  the  "reduced  fault  diagnosis 
equations"  in  which  all  component  values  are  assumed  to  be  nominal 
except  for  q specified  parameters;  r^(2)'  •••'  ri(q); 


14 


1 

Fi(l)'  i (2) i (q ) then  the  equation 

14'  m = Fi  ( 1)  , M2),  ...  i(q)(ri(l),  ri(2),  •••  ri(q)} 

will  have  a solution  if  and  only  if  the  faulty  parameter  values 
are  among  the  r^^,  ri(2)'  **•  ri(q)"  As  suc^»  if  one  attempts 
to  solve  14  for  each  allowable  family  of  faulty  parameters,  the 
actual  fault  will  be  indicated  by  the  existence  of  a solution  to 
the  equation. 

Although  such  a search  algorithm  might  at  first  seem  to  be 
highly  inefficient,  when  one  observes  that  with  the  aide  of  House- 
holder's formula,  the  evaluation  of  F.  ...  requires 

1(1),  1(2),  ...,  l(q) 

only  the  inversion  of  pxp  (p  ~ q)  matrix  it  is  seen  that  this  is 
not  the  case.  Moreover,  if  one  searches  for  the  most  likely  fail- 
ures first,  relatively  few  equations  need  be  solved  in  practice. 

In  actual  implementation  in  a "simulation  after  test"  algorithm, 
one  can  readily  search  through  all  possible  combinations  of  one, 
two,  or  three  simultaneous  failures,  and  commonly  encountered  com- 
binations of  larger  numbers  of  failures,  thus  locating  the  far 
majority  of  failures  in  a reasonable  amount  of  ATE  time. 

An  alternative  formulation  of  the  search  algorithm  which  allev- 
iates the  numerical  difficulties  associated  with  the  attempt  to 
solve  a set  of  equations  which  may  not  have  a solution  (as  is  the 
case  whenever  one  attempts  to  solve  14  with  the  wrong  choice  of 
faulty  parameters)  is  to  employ  an  optimization  algorithm,  rather  than 
an  equations  solver,  to  minimize 
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15.  Ji(l),  1(2),  ...  i (q ) (ri  (1 ) f ri  (2)  ' •**  ri(q) 


m - F . 

i 


(1),  1(2),  ...  i(q){ri(l)'  ri(2)'  •••  ri(q)  I ^ 


Since  15. has  a zero  minimum  if  and  only  if  14. has  a solution  a search 
through  the  minimization  of  15. for  all  allowable  sets  of  faulty 
parameters  will  also  locate  the  faulty  parameters  (indicated  by  a 
zero  minimum) . 

Examples 

As  a first  example,  consider  the  LC  filter  shown  in  Figure  2. 
for  which 


Figure  2.  LC  Filter. 


the  component  connection  model  takes  the  form 


Since  we  assume  that  the  source  and  load  resistors  are  external  to 
the  filter  and  do  not  fail  they  have  been  imbedded  into  the 
connection  equations  and  thus  do  not  appear  explicitally  as  com- 
ponents. The  filter  components  are  assumed  to  have  the  nominal 
values 

18.  Cx  = 10,  L1  = 20,  C2  = 30,  and  L2  = 40 

and  it  is  assumed  that  no  more  than  one  component  fails  at  a time 
(though  the  failure  may  be  catastrophic) . Our  "simulation  after 
test"  fault  diagnosis  algorithm  then  requires  that  we  minimize 
J2 ( ) > J3^C2^'  and  The  Performance  measure 

with  zero  minimum  then  represents  the  failed  component  with  the 
minimizing  value  for  that  performance  measure  representing  the 
value  of  the  failed  component.  All  other  component  values  must 
then  be  nominal  (since  it  is  assumed  that  only  one  component 

_ . .. J 
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fails).  Note:  the  minimizing  value  for  the  non-zero  J^'s  does 
not  correspond  with  the  correct  component  values  for  those  com- 
ponents . 

This  filter  was  simulated  with  each  of  its  four  components 
out  of  tolerence  (by  as  much  as  100  percent)  with  the  search 
algorithm  being  applied  to  the  simulated  data.  Since  only  one 
parameter  is  assumed  to  fail  at  a time  and  Z(s,r)  is  diagonal 
each  of  the  four  required  minimizations  was  carried  out  by  purely 
scalar  operations  using  a Golden  Section  search.  In  all  four 
cases  the  fault  was  correctly  located  with  the  faulty  parameter 
value  being  determined  "exactly."  The  resultant  data  is  summar- 
ized in  Table  1.  Note:  in  each  case  the  minimum  value  for  J. 

l 

for  the  faulty  component  is  at  least  three  orders  of  magnitude 
lower  than  the  minimum  value  J\  for  any  non-faulty  component. 

As  such,  the  failure  is  easily  located  and  one  can  expect  the 
algorithm  to  remain  viable  in  the  face  of  numerical  and  or 
approximation  error. 

As  a more  sophisticated  example,  consider  the  one  stage 

transistor  amplifier  of  Figure  3 and  its  wide  band  equivalent 

circuit  shown  in  Figure  4.  Note  that  the  parallel  resistors,  Ra 

and  R^,  appearing  in  this  model  have  been  lumped  together  into  a 

single  resistance,  R , since  it  is  clearly  impossible  to  dis- 

s 

tinquish  between  failures  in  these  two  components  from  external 


measurements . 


The  component  and  connection  equations  for  this  circuit  are 
given  by  equations  18  and  19  and  the  nominal  values  for  the 
component  parameters  are  taken  to  be 
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As  before,  it  was  assumed  that  no  more  than  one  component  failed 
and  J^(r^)  was  minimized  for  each  of  the  12  component  parameters. 
Once  again  the  failure  was  clearly  located  by  the  smallest 
minima  with  accurate  determination  of  the  faulty  parameter 
value.  Indeed,  in  each  case,  the  minimum  value  of  J\  (r^)  for 
the  faulty  parameter  value  is  at  least  5 orders  of  magnitude 
less  than  the  minima  for  the  remaining  J\(r^).  As  such,  there 
is  no  ambiguity  whatsoever  in  the  determination  of  the  faulty 
component  and  its  value  even  though  the  component  parameters  have 
been  allowed  to  deviate  from  their  nominal  values  by  as  much 
as  500  percent.  The  results  of  our  simulations  are  tabulated 
in  Table  2.  In  these  simulations  it  was  assumed  that  no  faulty 
parameter  exceeded  a value  of  1000  and  hence  the  search  was 
stopped  if  the  minimizing  value  for  J\  (r^)  reached  1000.  This 
was  necessitated  by  the  requirement  that  the  Golden  Section 
search  be  restricted  to  a finite  interval.  Of  course,  the  mini- 
mization algorithm  can  be  easily  modified  to  take  into  account 
infinite  values  of  r^;  i.e.  open  or  short  circuited  components. 
Multiple  Failures 

Although  we  have  not  given  any  numerical  examples  of  the 


J 


21 


application  of  the  search  algorithm  to  the  case  where  multiple 
failures  are  assumed,  the  basic  concept  of  our  algorithm  remains 
valid.  Computationally,  the  simple  one-dimensional  Golden  Section 
search  used  to  minimize  J\(r^),  however,  must  be  replaced  by  a 
multidimensional  optimization  algorithm;  say  steepest  decent. 


conjugate  gradient,  etc;  to  minimize  J\ 
(r . 


i(l),  i (2)  , — , i (q) 

id)  / ri(2)'  **•'  ri(q)'  f°r  '2ac^1  set  °f  1 allowable  failures. 

In  addition,  each  evaluation  of  F.  ,,,  . ...  requires 

i(l),  i(2),  . . . , l (q)  ^ 

the  inversion  of  a pxp  matrix  (p  ; q ) rather  than  the  simple 

scalar  operations  required  in  the  single  fault  case. 

An  alternative  approach  to  the  minimization  of  J. ...  . 

i (1)  , l (2)  , 

• , , which  requires  only  scalar  mathematics  is  to  use  a 

• • • 9 ± \Q) 

cyclical  one-dimensional  search  algorithm  on  the  q parameters 


r i ( 1 ) ' ri (2 ) 


, . . . , 


r . 


Here,  one  minimizes  J. 


i (q ) " ^ i (1)  , i (2)  , ...,  i(q) 

as  a function  of  one  parameter  at  a time  cycling  through  the  q 

parameters  until  a minimum  is  achieved.  Although  such  an 

optimization  algorithm  usually  requires  more  iterations  than  a 

true  multidimensional  optimization,  the  entire  process  can  be 

carried  out  with  scalar  operations.  In  particular,  since  one 

only  varies  a single  parameter  at  a time,  Householder's  formula 


allows  F. 


to  be  evaluated  at  each  interation 


i(l) , i (2) , ...  i (q) 
without  matrix  inversion  while  a simple  Golden  Section  search 

may  be  used  for  each  one-dimensional  minimization. 

Robustness 


Unlike  the  case  of  fault  diagnosis  in  a digital  system  wherein 
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a component  is  unambiguously  good  or  bad, in  an  analog  circuit  or 
system,  a component  parameter  is  either  in  tolerence  or  out  of 
tolerence.  As  such,  any  fault  diagnosis  algorithm  which  makes 
use  of  the  nominal  component  parameters  must  be  tested  for  ro- 
bustness. I.e.  how  effective  is  the  algorithm  at  locating  the 
faulty  component (s)  when  the  good  components  are  not  precisely 
equal  to  their  nominal  values.  As  such,  our  search  algorithm 
for  fault  diagnosis  was  applied  to  the  transistor  amplifier  using 
simulated  measurements  in  which  one  component  was  out  of  tolerence 
(taken  to  be  10  percent)  and  the  remaining  component  parameters 

*7 

were  in  tolerence  but  not  equal  to  their  nominal  values.  Of 
course,  the  nominal  values  are  used  to  define  the  F^  since  the 
actual  value  of  the  good  components  is  unknown.  Not  surprisingly, 
this  results  in  some  ambiguity  in  the  diagnosis  process  since 
J^(r^)  can  never  be  reduced  exactly  to  zero.  Fortunately,  the 
data  of  Table  2 resulting  from  our  "perfect"  simulation,  indicates 
that  we  have  five  orders  of  magnitude  in  which  to  work.  As  such, 
our  simulation  yielded  good  though  not  perfect  results.  In  par- 
ticular, the  algorithm  correctly  located  the  fault  in  71  percent 
of  the  trials  with  an  ambiguity  group  of  one  in  50  percent  of 
these  cases  and  ambiguity  groups  of  two,  three  and  four  in  the  re- 
maining cases.  Here,  the  ambiguity  group  was  taken  to  be  the  set  of 

all  r.  for  which  the  minimum  value  of  J.  was  of  the  smallest  order 
l l 

of  magnitude  achieved  by  any  of  the  performance  measures.  The 
results  of  some  typical  simulations  are  shown  in  Table  3.  Since 
all  of  the  good  components  in  this  simulation  were  taken  to  be 
at  the  limits  of  their  tolerence  interval,  these  results  actually 
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represent  a worst  case  situation.  As  such,  we  believe  that  the 
search  algorithm  will  yield  significantly  better  results  in  a 
"real  world"  situation,  wherein  most  of  the  components  will  have 
near  nominal  values  with  relatively  few  of  the  "good"  component 
parameters  lying  near  their  tolerence  limits. 

Hybrid  Algorithms 

Although  the  terminology  has  only  recently  been  formulated^-, 
most  of  the  algorithms  which  have  been  proposed  over  the  years 
of  the  solution  of  the  fault  analysis  problem  in  analog  circuits 
and  systems  can  naturally  be  catagorized  as  either  "simulation 
before  test"  or  "simulation  after  test"  algorithms.  Although 
the  preceding  development  has  been  presented  in  the  context  of  a 
"simulation  after  test"  algorithm,  any  of  the  techniques,  such  as 
the  application  of  Householder's  formula^,  are  also  applicable  to 
"simulation  before  test"  algorithms.  Indeed,  the  techniques  are 
ideally  suited  to  a hybrid  algorithm.  Here,  one  would  employ  a 
two-pass  diagnostic  algorithm  wherein  the  measured  data  vector, 
m,  is  first  compared  with  pre-simulated  data  stored  in  a fault 
dictionary.  If  the  fault  is  so  located,  the  diagnosis  process  is 
terminated.  If  the  fault  is  not  located  among  those  which  have 
been  presimulated  and  stored  in  the  fault  dictionary,  the  hybrid 
algorithm  will  then  revert  to  a "simulation  after  test"  mode  until 
a sequence  of  parameter  vectors,  r^ , and  simulated  data  vectors, 
rtK  , have  been  computed  which  converge  to  the  solution  of  the 
fault  diagnosis  equations.  At  the  same  time  the  results  of  each 
of  these  "after  test"  simulations  are  stored  in  the  fault  dictionary 
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for  use  in  future  applications  of  the  test  algorithm.  As  such,  a 
fault  dictionary  is  slowly  built  up  which  includes  simulations  of 
those  failures  which  are  most  commonly  encountered  in  actual 
practice.  Such,  a hybrid  algorithm  would  seem  to  achieve  the 
best  of  both  worlds.  Common  faults  would  be  found  quickly  on  the 
first  pass,  yet  the  system  would  still  have  the  "simulation  after 
test"  algorithm  upon  which  to  fall  back  when  encountering  a new 
failure  mode.  Moreover,  AT PG  requirements  would  be  greatly  re- 
duced with  only  the  most  common  faults  (say  open  and  short  circuits, 
single  failures,  etc.),  being  pre-simulated  and  the  remainder  of 
the  fault  dictionary  being  adaptively  generated  by  the  "simulation 
after  test"  algorithm  as  new  fault  modes  are  encountered.  Such  a 
hybrid  scheme  alleviates  the  necessity  of  determining  the  fault 
modes  of  a system  in  advance,  as  required  for  "simulation  before 
test"  while  simultaneously  eliminating  the  duplicate  simulations 
of  common  faults  required  for  "simulation  after  test". 

Conclusions 

Our  purpose  in  the  preceding  has  been  the  formulation  of  a 
class  of  techniques  which  we  believe  can  serve  as  the  basis  of 
an  effective  algorithm  for  fault  diagnosis  in  linear  analog  cir- 
cuits and  systems.  These  techniques  have  proven  to  be  effective 
in  the  situation  where  all  good  component  parameters  are  "near" 
nominal  and  give  promise  of  sufficient  robustness  to  cope  with 
the  "real  world"  situation,  in  which  the  good  component  parameters 
are  in  tolerence  though  not  nominal. 

Although  the  presentation  has  been  formulated  in  the  context 
of  a "simulation  after  test"  algorithm,  the  techniques  presented 
are  also  applicable  to  "simulation  before  test"  and  hybrid  algorithms. 
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Component 

C1 

L1 

C2 

L2 

Actual  Parameter 
Values 

20 

20 

30 

40 

Minimum  Value  for  J. 

l 

1.25X10*12 

5.88x10  8 

8.46xl0*9 

1.96xl0*6 

Minimizing  Component 
Value 

i 

i 

20 

30.66 

39.87 

51.89 

. 1 

Component 

C1 

L1 

C2 

L2  ! 

Actual  Parameter 
Values 

10 

40 

30 

40 

Minimum  Value  for  J. 

i 

1.62x10* ' 

2.10xl0*4 

2. 42x10* ' 

6.94x10"° 

Minimizing  Component 
Value 

28.75 

40 

48.5 

62.2 

, 

Component 

C1 

L1 

C2 

4 

Actual  Parameter 

Value 

10 

20 

50 

1 

40 

Minimum  Value  for  J. 

| 

2.9LxlO*8 

2.73x10* 1 

1.47xl0*13 

6.87X10*6  : 

Minimizing  Component 
lvalue 

30.27 

41.62 

50 

■ 

64.01 

l 

Component 

C1 

h 

C2 

L2 

Actual  Parameter 

Value 

10 

20 

30 

45 

i 

Minimum  Value  for  J. 

i 

3. 93xl0*7 

4. 93xl0*7 

4.13xl0*7 

1.04xl0*13 

1 

Minimizing  Component 

lvalue  14.18  24.46  34.13  45 


Table  1.  Fault  Analysis  for  LC  Filter 
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Actual  Parameter 


Table  2.  Fault  Analysis  for  Transistor  Ampl 


Table  2.  Fault  Analysis  for  Transistor  Amplifier  (Cont: 


Table  3.  Fault  Analysis  of  Transistor  Amplifier  with  10  percent  tolerences 
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ABSTRACT 

An  efficient  computer-aided  root-locus  method  is  described.  The 
approach  is  based  on  the  concept  of  continuation  methods  in  which  the 
solution  of  a parameterized  family  of  algebraic  problems  is  converted 
into  the  solution  of  a differential  equation.  The  root-locus  plot  is 
obtained  in  a systematic  manner  by  numerical  integration.  Singulari- 
ties are  analyzed  and  classified  according  to  the  properties  of  higher 
order  derivaties.  Depdending  on  their  classification,  singular  points 
on  the  root  loci  are  taken  care  of  accordingly. 

INTRODUCTION 

The  root-locus  method  is  one  of  the  most  important  design 
techniques  for  linear  time-invariant  feedback  systems.  In 
addition  to  yielding  frequency  response  information  of  the 
system,  it  also  provides  a powerful  tool  for  solving  prob- 
lems in  the  time  domain.  The  basic  idea  of  the  root-locus 
method  is  to  determine  the  closed-loop  pole  configuration 
as  a function  of  the  gain  from  the  configuration  of  the  open- 
loop  poles  and  zeros.  A great  deal  of  information  is  avail- 
able in  texts  and  literature  on  the  method  for  the  construc- 
tion of  root  loci.  The  graphical  method  using  certain 
elementary  geometric  properties  of  the  locus  is  probably 
the  most  commonly  used  approach  (see  e.g.  [II-  [3]).  Other 
approach  [4]  - [7]  employ  either  analytic  or  semi-analytic 
representations  that  involve  the  use  of  equations  of  the 
loci.  Although  analytic  approaches  enable  one  to  obtain 
accurate  plots  along  with  certain  qualitative  features  of 
the  root  paths,  the  point-to-point  plotting  is  just  a for- 
midable task.  Besides,  investiations  for  higher  order  systems 
are  virtually  impractical. 
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It  is  the  purpose  of  this  paper  to  develop  a computer- 
aided  method  for  plotting  root  loci  in  a systematic  manner. 
The  approach  to  be  presented  in  Section  II  is  based  on  the 
concept  of  continuation  methods  [8]  - [10] . The  basic  idea 
is  to  convert  the  solution  of  a parameterized  family  of 


algebraic  problems  into  the  solution  of  a set  of  associated 
differential  equations.  Section  III  is  concerned  with  the 
existence  and  classification  of  singular  points  on  the  root 
loci.  In  section  IV  the  results  obtained  are  illustrated 
by  means  of  examples. 

The  Root-Locus  Method 


Consider  the  closed-loop  system  shown  in  Fig.  1. 
the  open-loop  transfer  function  be  expressed  by 


G (s) H (s) 


(s-z, 


A K- 


<s-p. 


) (s-z2) 
) (s~ P2) 


(s-z 


m 


(s-p 


n 


) 

) 


Let 


(1)  ! 


where  K is  the  open-loop  gain  and  m < n.  The  closed-loop 
transfer  function  is 


G ( s ) = G ( s ) B ( s) 

1 + G (s)  H (s)  B (s)  + KA  ( s ) 


(2) 


The  root-locus  plot  of  the  closed-loop  transfer  function  T(s) 
is  defined  as  the  locus  of  the  poles  of  T(s)  when  K varies 
from  zero  to  infinity.  This  plot  consists  of  a set,  denoted 
by  of  points  in  the  s-plane  such  that 


g (s , K)  A B (s)  + KA(s)  = 0,  (3) 

i . e.  , 

Z = (s|g(s,K)  = 0 }.  (4) 

Instead  of  solving  the  roots  of  (3)  directly  for  each 


K,  a system  of  two  simultaneous  differential  equations 
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g(s(t),  K ( t ) ) * -g  (s  (t) , K ( t) ) , g (s (0) , K(0))  = 0 

d (5) 

dt  K(t)  " ± K (0)  = Kq 

is  considered  where  S ( 0 ) » SQ  is  a root  of  [3]  correspond- 
ing to  an  initial  gain  Kq,  and  t is  a dummy  variable.  Appli- 
cation of  the  chain-rule  to  (5)  results  in 


ds 

dt 


~{g+  ' s(0) 


= s 


dK 

dt 


3T  - + 1 


(6) 


or  equivalently. 


ds  _ (B  (s ) + KA  (s) ) + A(s) 
dt  B’ (s)  + KA*  (s) 


s(0)  = s. 


dK 

dt 


(7) 


s-r  * + 1. 


K ( 0)  = K 


Equation  (7  ) can  now  be  solved  by  any  numerical  integration 
technique.  For  example,  using  Euler's  Method,  (6)  reduces 
to 


9k+l 


^ b(\j  + + A(sk} 

" h B'  Cs  k)  + KkA*  (sk) 


(8) 


Kk+1  = Kk  i h* 


It  is  seen  from  the  solution  of  ( 5) 

g (s,K)  =*  g ( s (0)  ,K(0)  ) e-t  * 0e_t  = 0 


K = + t 


that  for  any  admissible  pair  Kq  and  sQ  satisfying  (3),  the 
corresponding  trajectory  resulted  from  (5)  will  remain  on 
the  solution  curve  g(s,K)  = 0 as  K changes.  The  + or  - 
sign  is’ chosen  depending  on  whether  one  would  like  to  in- 
crease or  decrease  K.  Since  the  computed  trajectory  may 
not  satisfy  (3)  exactly,  the  minus  sign  in  front  of  g in 
(5)  is  used  to  ensure  that  the  computed  trajectory  does  not 
diverge  away  from  the  locus. 

It  is  a well  known  fact  that  the  root-locus  plot  for 
T(s)  contains  n branches  starting  from  the  open-loop  poles 
at  K = 0.  Therefore,  in  the  case  where  the  open-loop  poles 
are  distinct,  the  n initial  conditions  for  (7)  are  selected 
at  K (0)  = 0 and  s(0)  = P^,  i = l,2,...,n.  In  the  case  where 
the  open-loop  transfer  function  contains  repeated  poles,  the 
term  (3g/3s)  becomes  zero  when  evaluated  at  the  repeated 
poles.  As  a result,  the  selection  of  starting  points  cannot 
be  made  at  K = 0 for  the  repeated  poles.  Approximate  starting 
points,  however,  can  easily  be  obtained  by  analyzing  the 
properties  of  the  root  loci  in  the  neighborhood  of  the  re- 
peated pole.  Suppose  the  open  loop  gain  has  a repeated  pole 
Pk  with  multiplicity  r and  the  corresponding  g(s,K)  has 
the  form 

g(s,K)  = B ( s)  + KA ( s ) 

= (s-p^  (S-P2)  . . . (s-pk)r.  . . (S-Pn_r)  + KA(S)  = 0.  (9) 
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Let 


w ■ pk  +■  As 


+ ee 


j6 


(10) 


where  e is  an  arbitrarily  small  real  number  and  0 is  a phase 
angle  yet  to  be  determined.  Thus  at  s = w 


g(w,K)  » g(pk  + Ee^9/K) » 0 (11) 

Solving  As  from  (11)  , gives 

As  = ee^0  = (Kpe-^)1^  (12) 


where 


- - A(s)(s~p*)r 
B(s) 


S = Pv 


(13) 


Therefore  r approximate  starting  points  in  the  neighborhood 
of  the  repeated  pole  p^  and  the  corresponding  open-loop  gain 
can  be  evaluated  from  (12)  as 


w. 


^ + ee^  r 1 , 


r 3 0,1,2,.. .r-1 


and 

(14) 

K * - er. 

P 

Wit11  th®  proper  choice  of  n starting  points,  the  n branches 
of  the  root-locus  plot  can  be  traced  in  a continuous  manner  by 
numerical  integration.  In  computing  the  root  locus,  care  must 
be  exercised  when  approaching  a singular  point  on  the  locus. 
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Singular  Points 


A point  s*  satisfying 


dK 

as 


s=s* 


. d , B (s)  . 
ds  1 A ( s ) ' 


s=s* 


, _ A (s* ) B 1 (s*)  ~ B ( s* ) A ' (s»)  = 0 

a2(s*) 


(15) 


in  the  complex  plane  is  called  a singular  point.  Since  the 
numerator  of  (15)  is  a (n+m-l)th  order  polynomial  of  real 
coefficients,  there  are  (n+m=l)  singular  points  in  the 
s-plane.  Only  those  singular  points  that  are  located  on  the 
root  loci  will  be  considered.  In  view  of  the  fact  that  A(s*) 
cannot  be  zero  for  a finite  K,  it  follows  that  on  the  root 
locus , the  condition 


A ( s* ) B ' ( S*)  - B ( s* ) A ' ( S* ) = A(S*)(B'(S*)  + KA ' ( S* ) ) = 0 

(16) 

implies 

= B’  (s*)  + KA'  (s*)  = 0.  (17) 

s=s* 

Hence  (7)  is  not  valid  at  s - s*  and  modifications  must  be 
made  to  handle  these  singular  points.  Condition  (15)  does 
include  the  conventional  break-in  and  break-away  points  at 
which  K is  either  a local  maxima  or  a local  minimum  on  the 

J Tf 

real  axis,  respectively.  In  general, 


is  a complex  quantity 
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if  s is  not  located  on  the  real  axis  and  it  does  not  make 

sense  to  talk  about  local  extremal  values  without  proper 

modification.  Now,  since  K is  a real  valued  function  of  s 

for  all  s on  the  root  locus  £.,  the  directional  derivative 
clK 

together  with  its  higher  order  derivatives  along  the 
tangential  direction  of  the  locus  are  well  defined.  It  is 
thus  possible  to  consider  local  extremal  values  of  K along 
l using  the  notion  of  directional  derivatives. 

Let 

K(s)  = -A-ify  £ u<x'y>  + jV(x,y)  (18) 

where  D and  V are  real  valued  functions  of  x and  y,  and 

s = x + jy.  (19) 


The  directional  derivative  of  K at  a point  se£.  in  the  unit 


direction  v 


je 


v + jv  tangential  to  the  root  locus  is 
x y 


dK  . 3U  3U 

ar_  7D(x'r>-v  - ^ vx  + 57  vy  • 


(20  ) 


The  above  equation  can  also  be  written  in  the  form 


k-r®h|2  -j  |2)  <vx  + jvy)  ] , .ci. 


(21) 


Making  use  of  the  Cauchy-Riemann  condition,  (21)  reduces  to 


dK 

d£ 


Re 


.,30 


+ j 


= Re[K ’ (s)ej0l 


(22) 


1 


where  K’  * dK/ds. 


42 


Similarly,  higher  order  directional  derivatives  of 
K with  respect  to  £ are  related  to  the  higher  order  deriva- 
tives by 

^ » R (K'^lsje3"6].  (23) 

dlm  e 

Thus  it  is  seen  from  (23)  that  along  l,  K and  its  directional 
derivatives  are  all  real-valued  functions.  The  following 
theorem  which  plays  an  important  role  in  singularity  classi- 
fication will  be  proved. 

Theorem 

Suppose  p(s)  is  an  analytic  function  such  that 

p(s*)  = a,  for  a real  a^  0, 
p(k) ( s* ) =0 , k = l,2,...,q-l, 
and  (s*)^  0,2  _<  q <_  n 

at  some  point  s*  located  on  Inp(s)  = 0.  Let 

Rp  = ( s | Imp(s)  = 0}  . 

Then  in  the  neighborhood  of  s*,  Rp  consists  of  q branches, 

Rpl'  Rp2  > • • • * Rpq'  and  Rpi  n Rp2  n ---fl  Rpq  - s*.  Furthermore, 
for  each  i,  1 < i < q,  Rep(s) I „ is  either  a local  maximum  or 

- - I Rpi 

a local  minimum  at  s*  if  q is  even;  it  is  either  an  increasing 


function  or  a decreasing  function  if  q is  odd. 
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Proof : 

Without  loss  of  generality  s*  can  be  assumed  to  be 
zero.  Before  considering  the  general  case,  the  theorem 
is  proved  for 

h (s)  » a + sq. 

Identifying  the  set  R^  from 

Rjj  = {S  | Imh(s)  = 0} 

yields 

R^  ■ {re^9|rq  sinqQ  *0} 

■ {re^9|0  » iir/q/  i * 0,1,2, .. .q-1} 


where  6 is  restricted  in  the  upper  half  of  the  s-plane  and 
r assumes  negative  values  in  the  lower  half  plane.  Thus 
Rjj  consists  of  q intersecting  branches  R^,  i * 0,1,2,..., 
q-i.  The  intersection  occurs  at  r a 0.  For  each  i. 


Reh(s) 


*hi 


a + r’costq©^)  = a + rHcos(iit). 


Therefore  if  q is  even,  r^  is  always  nonnegative,  and 


Reh(s) 


> 

< 


when  cos  iir 
a 

when  cos  iir 


1 

-1, 


i.e.,  h ( s > ] is  either  a local  maximum  or  a local  minimum. 
| ni 

Now  if  q is  odd,  then  for  each  i. 
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Reh  (s) 


Hence  (h(s) 


*hi 


*hi 


= a + rMcos  ( iir)  . 


-a)  will  change  sign  either  from  plus  to  minus 


as  r increases  from  negative  value  to  positive  value  or  vice 


versa,  i.e.,  h(s) 


*hi 


is  a monotonic  function. 


Returning  to  the  general  case,  p(s)  can  be  expanded  into 
a Taylor  series  around  s*  as 


p(s)  = p ( s* ) + y c.{s-s*)i 
i=q  1 


= . - (s-s-)q  l C k(»-.*) 


k=0 


Since  the  summation  in  the  above  equation  is  an  analytic 
function  and  has  no  zero  in  a small  disc  around  s*,  from  a 
theorem  in  the  theory  of  complex  variables  (see  e.q.  [11]), 
there  exists  an  analytic  function  u(s)  such  that 

CO 

1 ck+a(s”s*)k  = exp(u(s)). 
k=0  K+q 

Let  v ( s ) = exp  (u(s)/q).  Then 

P ( s ) * a + [(s-s*)v(s) ]q  A a + [f(s)]q 

where  f(s*)  = 0,  f'(s*)  ^ 0.  Thus  f(s)  is  a local  homeomorphism. 
Now 


Rp  - Cs | Im(a  + (f(s))q)  = 0} 


{ s | f (s) e Rh> 
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Rpi  * {s*  f(x)e  Rhi}  ' 1 * 0,lf2,...fq-l. 

If  q is  even,  then  s e Rpi  implies  f(s)  e R^.  It  follows 
that 


Reh(f (s) ) Reh(O)  -R«h(f(s*)), 


when  cos  iir  ■ 1 


when  cos  iir  » -1  , 


_>  when  cos  iir  * 1 

Rep(s)  Rep ( s* ) , 

Rpi  <_  when  cos  iir  * -1. 

Therefore  Rep(s)|  is  either  a local  maximum  or  a local 

Rpi 

minimum  at  s*.  Similarly  if  q is  odd,  then  it  can  be  shown 

that  Rep (s)  R is  either  an  increasing  function  or  a decreasing 

Pi 

function. 

As  a directly  consequence  of  the  above  theorem,  the  following 
corollary  is  deduced. 

Corollary 

Suppose  s*  is  a singular  point  on  l such  that 


K(s*)  ji  0 


0,  k * 1,2,... ,q-l 


? 0,  2 < q < n 


s=s* 
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where  K(s)  = -B(s)/A(s),  then  there  are  q branches  intersecting 
at  s = s*.  Furthermore  if  q is  even,  then  along  each  branch 
of  the  intersecting  root  loci  at  s = s*,  K(s*)  is  either  a 
local  maximum  or  a local  minimum;  otherwise  K(s*)  is  a monotonic 
function  of  s on  that  branch  in  the  neighborhood  of  s*. 

Proof : 

Let  f(s)  = - B(s)/A(s).  Then  f(s)  is  an  analytic  function. 
It  follows  that 

B ( s)  + f(s)A(s)  = 0. 

Comparing  the  above  equation  to  (3) , it  is  obvious  that 
K = Ref (s) 

for  all  s e £. 

0 = Imf(s) 

Application  of  the  above  theorem  to  f(s)  completes  the  proof. 

According  to  the  above  corollary,  singular  points  are 

characterized  by  the  properties  of  higher  order  derivatives. 

It  is  noted  that 

3c  3c  )c 

[~b (s) /a(s)  ] = - 2-SL  /a ( s ) . (24) 

dsK  dsK  3sK 

Since  g(s)  is  an  nth-order  polynomial  with  real  coefficients, 

)c  ]c 

3 g/3s  can  easily  be  generated.  Furthermore,  q can  at  most 


be  equal  to  n since 


p 
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£gCI 

3sn 


n! 

A (s) 


* 0 


(25) 


With  the  above  corollary,  the  conventional  break-in 
and  break-away  points  defined  on  the  real  axis  can  now  be 
generalized  as  follows. 

Definition 

In  the  theorem,  the  singular  point  is  called  an  even 
singular  point  if  q is  even;  otherwise,  it  is  said  to  be  an 
odd  singular  point. 

It  is  clear  from  the  corollary  that  on  the  root  locus 
an  even  singular  point  is  either  a local  maximum  or  a local 
mininum  along  the  branch  as  defined  in  the  Theorem.  The  con- 
ventional break-in  and  break-away  points  are  just  special 
cases  of  even  singular  points  of  order  2(i.e.,  q = 2)  which 
are  located  on  the  real  axis.  In  general,  if  a singular 
point  is  located  off  the  real  axis,  the  concept  of  directional 
derivatives  can  be  used  to  characterize  the  singular  point. 
From  (23 ) and  the  corollary  it  is  obvious  that  at  an  qth-order 
singular  point,  d^/d*!  * 0 for  k = l,2,...,q-l  and  dqK/d£q  ^ 0 
There  are  q branches  of  the  root  loci  intersectir  g at  the 
singular  points. 

In  generating  the  root-locus  plot,  each  branch  is  plotted 
separately  as  K increases.  In  the  neighborhood  of  an  odd 
singular  point,  since  K is  a monotonic  function  of  s on  the 
locus,  the  first  order  directional  derivative  is  a continuous 
function.  Thus  when  approaching  an  odd  singular  point,  it  is 
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necessary  to  jump  over  the  singular  point  by  adding  a small 
variation  j As  | along  the  tangential  direction  of  the  locus. 

For  even  singular  points,  such  procedure  is  invalid.  Since 
on  the  root-locus  plot  an  even  singular  point  is  either 
a local  maximum  or  a local  minimum,  such  construction  does 
not  give  rise  to  an  increasing  K.  Thus  in  order  to  cin- 
tinue  the  plotting  of  the  root  locus  as  a function  of  in- 
creasing K,  it  is  necessary  to  change  the  direction  of  the 
locus  when  an  even  singular  point  is  approached.  Depending 
on  the  order  q of  the  even  singular  point,  Az,  the  change 
in  direction,  is  chosen  as 

Az  = Ase'jll/q  (26) 

where  As  is  a sufficiently  small  vector  in  the  tangential 
direction  of  the  locus  when  approaching  the  singular  point. 

The  factor  e^71^  can  be  viewed  as  a rotational  operator, 
which  rotates  the  direction  clockwise  by  ir/g.  On  a branch 
so  constructed,  the  even  singular  point  no  longer  has  the 
characteristics  of  a local  extremum. 

It  is  apparent  from  the  foregoing  root-locus  construction 
that  the  q branches  will  directly  intersect  each  other  at  an 
odd  singular  point  and  that  the  modified  q root  loci  will  touch 
each  other  at  an  even  singular  point.  For  obvious  reasons, 
an  odd  singular  point  is  known  as  an  intersecting  point  while 
an  even  singular  point  is  called  a touching  point.  The  graphical 
illustrations  for  these  two  types  of  singularities  are  shown 


in  Fig.  2 and  Fig.  3 for  q * 2 and  q = 3,  respectively.  The 
branches  are  numbered  and  the  arrows  are  pointed  in  the  dir- 
ection of  increasing  K. 

After  the  change  of  direction  at  a touching  point  or 
the  jump  over  an  intersecting  point,  it  may  be  necessary  to 
make  corrections  if  the  point  selected  is  not  close  enough 
to  the  locus.  This  can  usually  be  achieved  within  a few 
steps  by  using  the  Newton  iteration 

B (s.  ) + KA(s.) 

sk+l  " sk  ■ B* (Sk)  + KA' (Sk)  * (27) 

The  gain  K which  corresponds  to  the  corrected  point  can  be 
evaluated  either  from 


K = -ReB(s)/ReA(s) 


K * -ImB (s) /1mA (s) . 


Equations  (28)  and  (29)  are  derived  by  taking  the  real  and 
the  imaginary  parts  of  g(s)  - 0,  respectively. 


Examples 

In  this  section  a number  of  examples  are  presented  to 
illustrate  the  proposed  algorithm  for  obtaining  the  root- 
locus  plot.  Although  any  integration  technique  can  be  used 
to  solve  (7),  only  the  Euler's  method  with  variable  step  size 
is  used  for  illustration. 
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Example  1 : 

Consider  a linear  feedback  system  whose  open  loop  transfer 
function  is  given  by 


G (s) H (s) 


(s  + 1.5) (s  + 5.5) 
s(s  + 1) (s  + 5) 


there  are  three  simple  poles  at  0,  -1  and  -5,  and  two  finite 
zeros  at  -1.5  and  -5.5.  Application  of  (7)  with  starting 
points  0,  -1,  and  -5  at  K = 0 leads  to  three  root  loci  shown 
in  Fig.  4.  It  turns  out  that  all  4 singular  points  are 
located  on  the  root-locus  plot  and  they  are  all  classified 
as  even  singular  points  with  q = 2.  It  is  thus  necessary 
to  change  the  direction  of  the  root  locus  when  each  singular 
point  is  approached.  For  branch  1 , when  is  approached, 
the  tangential  direction  As  = -e  and  Az^  is  chosen  as  (-eje^71^ 
= -e j . Similarly,  when  branch  1 approaches  Q^,  and  , 
the  changes  in  direction  are  chosen  as  AZ2  = 

Az^  = (-e)  f and  - (jejejTr/2^  respectively.  Other 

branches,  denoted  by  2 and  3,  are  obtained  in  a similar 
manner. 

Example  2: 

As  an  example  of  the  case  with  multiple  poles  consider 

G ( s ) H (s ) = - _ 

(s  + 3)  (s  + 1)  ^ 

where  s = -1  is  a repeated  pole  with  multiplicity  r = 2. 

From  ( 14  ) 
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w = 1+ee^^2 
o 

- l+ce:i(*+2,',/2 


where 


pe 


j<J>  = _ 


s + 3 


0 . 5e 


3* 


s 


-1 


and  e is  chosen  as  0.2  for  illustration.  Thus  the  two 
approximate  starting  points  are 


wQ  = -1  + j0.2,  K = 0.08 

W1  = -1  -j0.2,  K = 0.08. 

The  root  loci  are  obtained  by  using  (8)  with  s(0)  = wq,  w^,  -3 
and  K ( 0)  * 0.08,  0.08,  0,  respectively.  The  results  are  shown 
in  Fig.  5 where  the  root  loci  are  plotted  up  to  K = 5. 

Example  3 : 

In  this  example,  the  open-loop  transfer  function  is 
assumed  to  have  one  real  pole,  and  two  complex  conjugate 
poles: 

G(s)H(s)  - - 

s(s  + 3 + j/I)  (s  + 3-j/I) 

There  is  only  one  odd  singular  point  with  q =*  3 located  on 
the  root  loci.  It  is  thus  necessary  to  jump  over  the  singular 
point  when  it  is  approached  and  Az  is  chosen  in  the  tangential 
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direction  of  the  locus.  The  complete  root-locus  plot  is  shown 
in  Fig.  6. 

Example  4 : 

The  final  example  demonstrates  the  case  where  the  singular 
points  are  located  off  the  real  axis.  Consider 


G ( s)  H ( s) 


A (s) 
B ( s ) 


K 

(s+l)2(s+l+j/X8)  (s+l-j/Iff) 


It  is  seen  that 

g (s)  = B (s)  + KA(s)  = s4  + 4s3  + 24s2  + 40s  + 19  + K. 


Setting  the  derivative  of  g with  respect  to  s to  zero,  yields 
three  singular  points,  namely,  s1  = -1,  s2  = -l+j3  and 
s3  = -l-j3.  Since  s^  = -1  is  a repeated  open  loop  pole,  it 
can  be  taken  care  of  as  a starting  point.  At  s2  and  s^,  it 
is  easily  verified  that 


Img(s2)  = Img(s3)  = 0 


and 


t*  0 . 

s=s^  or  s2 


This  indicates  that  both  s 2 and  s3  are  located  on  the  root 
loci  and,  furthermore,  they  are  classified  as  even  singular 
points  with  q = 2.  Application  of  the  proposed  root-locus 
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plotting  procedure  with  four  starting  points 

s10  3 -1  + i*'  K " °*18  ' e 

s20  = -1  - je'  K = °*18  ' e 

s30  * -1  + j/r?'  K * 0 

s40  = -1  " 3^  K = 0 

leads  to  the  complete  root-locus  plot  shown  in  Fig.  7.  It 
is  clear  from  this  example  that  a necessary  condition  for 
the  existence  of  complex  singular  points  is  that  the  order 
of  the  open-loop  transfer  function  be  greater  than  or  equal 
to  4. 

Conclusion 

An  algorithm  for  generating  the  root-locus  plot  has 
been  presented.  Classif ication  of  singular  points  has  also 
been  discussed  in  detail.  It  is  shown  that  the  conventional 
breaJc-in  and  break-away  points  are  just  special  cases  of  even 
singular  points.  The  computer-aided  method  successfully 
solves  the  problem  of  discontinuity  of  the  direction  of  the 
locus  at  singular  points  and  enables  one  to  plot  the  root 
loci  without  missing  or  repeating  any  branch. 


=»  0.1 

= 0.1 
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Abstract 

This  paper  considers  the  estimation  of  mixed  rotational  and  doubly  sto- 
chastic Poisson  process  observables.  The  estimation  structure  for  the  general 
nonlinear  problem  is  obtained  using  Kushner's  method  for  the  development  of  the 
differential  equations  for  the  evolution  of  the  conditional  probability  density 
functions. 

Introduction 

There  are  problems  in  which  the  measurement  subsystems  observing  the  state 
of  a system  are  of  fundamentally  different  types.  The  measurements  may  be  a point 
process  whereby  the  number  and  times  of  arrivals  contain  information  about  the 
state  of  the  system.  The  measurements  may  contain  additive  noise  or  they  may  be 
rotational  processes.  Estimation  of  random  point  processes  is  considered  in  de- 
tail in  the  book  by  Snyder  [1].  Estimation  of  rotational  processes  has  been 
studied  in  detail  in  references  [3-8],  among  others. 

Many  problems  occur  in  actual  applications  in  which  there  is  a mix  between 
the  various  types  of  measurement  subsystems.  For  example,  in  reference  [2],  the 
problem  of  estimation  of  random  thrusts  in  a solar  electric  propulsion  spacecraft 
has  a mixture  of  both  doubly  stochastic  Poisson  process  measurements  from  a photon 
counting  star  tracker  and  additive  noise  measurements  from  Doppler  signals.  The 
theory  for  this  mixture  of  estimation  has  been  developed  in  reference  [9]  for  the 
linear  case  and  [11]  for  the  nonlinear  case,  and  independently  by  a different 
method  in  reference  [2]. 

In  this  short  paper,  the  problem  of  estimation  of  mixed  rotational  and  point 
process  observables  is  solved.  In  particular,  the  measurement  subsystems  consist 
of  two  different  types.  The  first  type  is  that  of  a doubly  stochastic  Poisson 
process  and  the  second  type  is  that  of  a rotational  process.  The  equation  for 
the  evaluation  of  the  conditional  probability  density  function  for  the  state 
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and  the  state  modulo  2tt  is  given,  along  with  the  functional  forms  for  the 
density  function.  A general  equation  is  given  for  the  evolution  of  moments 
and  from  this  a differential  equation  for  the  evolution  of  the  conditional  mean 
is  given.  Exponential  Fourier  series  [8]  is  used  to  develop  approximate  esti- 
mations for  the  case  when  the  intensity  rate  is  a periodic  process.  These  re- 
sults are  contained  in  the  long  version  of  this  paper. 

Problem  Statement 

This  section  contains  the  problem  statement  for  optimal  and  suboptimal  esti- 
mation of  mixed  rotational  and  point  process  observables.  The  system  and  measure- 
ment models  are  given. 

Consider  the  probability  space  (ft,  A,  P).  The  system  equation  is  given  as 


dx(t)  = f [x ( t ) , t]dt  + g[x(t),  t]de(t)  (1) 

where  x:  ft  -*■  R and  is  Borel  measureable,  3 is  a Wiener  process  with  unity  variance 
parameter,  f:  R'  x R'  -+■  R'  and  is  Borel  measurable.  One  of  the  measurements  sub- 
systems observes  the  state  x through  the  transformation  which  folds  z(t)  around 
the  unit  circle.  This  measurement  process  as  given  in  [3]  is  either  in  the  form 
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which  maps  the  process  onto  the  unit  circle.  This  type  of  measurement  process 
is  explored  in  detail  in  the  series  of  papers  by  Lo  and  Will  sky  [3,4,5]. 

The  process  x(t)  is  observed  by  another  measurement  subsystem  which  yields 
a counting  measurement  from  a doubly  stochastic  Poisson  process  N(t)  with  a 
periodic  intensity  rate  of  the  form 

x[x(t),  t]  where  x = R1  x R1  -+■  R1 

The  probability  of  n counts  in  an  internal  of  time,  conditioned  on  the  intensity 
rate,  is  given  as 

Pr  { -N^  = n|  X [x(a),  a],  s < a <_  t}  = 

(/$X[x(a) , a]  da)n  exp(-/^x[x(a , a)]da)  (6) 

n! 

where  N(t)  is  the  total  number  of  counts  from  t to  t.  It  is  assumed  that  the 
processes  nt  and  wt  are  indeptendent. 

Let  T.  be  the  sub-  -algebra  of  A generated  by  Zt  a {ZCt)^,  t < x <_  t}  and 
let  & (N(x),  t £ Y i t}.  Since  the  information  contained  in  the  process  Z 
is  the  same  as  the  process  z*"  & z(T)>t0lTlt  (See  reference  [3])  due  to  the 
bijectivity  of  the  Lie  group  mapping,  the  sub-a-algebra  generated  by  is  the 
same  as  that  by  Zt.  Let  Bt  = be  the  smallest  o-albegra  containing  Tt 

and  W^..  Then  the  estimation  problem  to  be  solved  is  to  obtain  the  estimates  of 
x(t)  given  the  o-algebra  8^. 

The  next  section  contains  the  equations  for  the  evolution  of  the  conditional 
density  functions,  an  equation  for  the  evolution  of  moments,  and  the  solution 
for  the  differential  equation  for  the  conditional  mean. 

Solution  of  the  Optimal  Estimation  Problem 

The  conditional  probability  density  function  for  x(t)  given  Bt  may  be 


written  as 


(7) 


P'x[(t)|Bt] 


E{expij/t|x(t)  = x}Px[x(t)] 
Elexp^} 


where 


+ f ^[a)’  ^ [Zt(a)dZ(a)  ]]2 


- / £ x[x(a),  a]da  + In  x[x(a),  ojdN^ 


(8) 


where  ^-denotes  an  lt0  integral  and  the  last  integral  is  a counting  integral  [1]. 
This  may  be  proved  by  using  the  fact  that,  given  x(t),  the  two  measurement 
sybsystems  are  independent;  the  representation  theorems  of  [1]  and  [3]  can 
then  be  applied.  The  foled  density  may  be  written  as 

P0[9(t)|8t]  * 

» E(exp4<t| x(t)  = 9 ( t ) + 2kir}P  (e(t)  + 2kir) 

v £ 

k=-°°  Eiexp^1 } ,Q» 


In  a similar  manner  as  [2]  or  [3],  the  stochastic  partial  differential 
equations  for  the  temporal  evolution  of  the  conditional  density  functions  may 
be  derived  for  P[x(t)|Bt]  and  P[e(t)|Bt]. 

Theorem  1:  The  stochastic  parial  differential  equation  for  the  evolution  of 
Px[x(t)|8t]  is 

dPx[x(t)|Bt]  = LPx[x(  t ) 1 8 t]dt  + { h[x  ( t ) , t]  - 

h[x(t),  t]dt>  { [Z(t)TdZ(t)]-j 2 - h[x(t),  t]dt} 

-PxCx(t) | B t]  + (x[x(t),  t]  - X[x(t),  t] } 

•x[x(t),  t]-1  (dN(t)  - x[x(t),  t]dt)Px[x(t)|Bt] 


(10) 
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where 

l(.)  . H-f)  t % »2(-g)  (1]) 

ax2 

and 

\ 

h[x(t),  t]  = E{h[x(t) , t]|8t>.  (12) 

Proof:  The  proof  may  be  obtained  by  using  Kushner's  method  [10],  and  it 
follows  as  in  reference  [2]. 

The  differential  equation  for  P[0(t)|8t]  follows  as  in  [3]  and  is 

oo 

dP0[9(t)|  t3  = z dP  [e(t)  + 2irn|Bt].  (13) 

n=-°° 

Theorem  2:  Let  (J)[x(t)]  be  a twice  continuously  differentiable  scalon  function 
of  the  state  x(t)  which  is  the  solution  of  equation  (1).  Then  the  differ- 
ential equation  for  the  8t  conditional  expectation  of  Ip , E Bt{(|>[x(t)]} , is 
given  as 

8 8 

dE  t{4»Cx( t)]}  = E t(4x[x(t)]  f[x(t),  t]}dt  + 

8.  n 

+ H E r{g  [x(t),  t]$xx[x(t)]}dt  + 

8t 

+ (E  t{Kx(t)]h[x(t),  t]>  - 
8 B 

- E t{|[x(t)]  )E  t{h[x(t),  t]}} 

"r(irr  t[Z(t)TdZ(t)  ^12  ■ 

8 8 

- E t{h[x(t),  t]}dt}  + (E  t{^[x(t)] 

8 8 
. X[x(t),t]}  - E t{Hx(t)]}  E 1 

8t 

. (<Kx(t)]}  E lU[x(t),  t]}} 

8t  1 

. E l{A[x(t),  t]}  'tdN(t)  - 

8t 

- E x{A[x(t),  t] }dt } 


■MM ■ 
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Proof:  This  follows  from  reference  [2]. 

Theorem  3:  Given  the  system  and  measurement  subsystems  as  previously  defined, 
the  equation  for  x(t),  E{x( t)  | B^. } , may  be  written  as 

8 8 
dx(t)  = E *{f  [x(t),  t]}dt  + ^ E t{[x(t)  - 

- x(t)  ]h[x(t) , t]}  {[Z(t)TdZ(t)T]12  - 

8 8 

- E ^{htxCt),  t]}dt}  + E t{[x(t)  - x(t)] 

8t  -1 

. A [x ( t ) , t]}  E r{A[x(t),  t]}  { dN ( t ) - 

B, 

- E t(A[x(t),  t]}  dt}  . 

The  covariance  of  the  estimation  error  may  be  written  as 
Bt 

dP(t)  = 2E  t{(x(t)  - x(t)f[x(t),  t]}dt  + 

Bt  2 8t 

+ E tg[x(t),  t]^}dt  + E r{ ( x( t ) - 

- x(t ) )^  (h[x(t),  t]  - E t{h[x(t),  t] } ) } 

8 

.{[Z(t)TdZ(t)]12  - E (h[x(t),  t]}  dt} 

g 

- E t{(x(t)  - x(t))h[x(t),  t]}2dt  + 

+ E t{(x(t)  - x(t))2  (x[x(t) , t]  - 

8 8 
E t{A[x(t),  t]}>  E t{A[x(t),  t]}-1 

8 8 
. { dN( t ) - E t{A[x(t),  t] }dt } - E t{(x(t)  - 

g 

- x(t))A[x(t),  t]}2E  t{A[x(t),  t]}'2dN(t). 

Proof:  Theorem  2 may  be  used  directly. 
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Since  the  estimator  of  Theorem  3 is  infinite  dimensional,  it  is  necessary 
to  consider  approximate  suboptimal  estimators. 

An  approximate  estimator  structure  may  be  obtained  by  using  moment  approxi- 
mations, as  in  [1]  and  [12].  It  will  be  assumed  that  the  estimate  is  unbiased 
and  the  fourth  order  moments  of  the  estimation  error  can  be  factored  into 
products  of  second  order  moments  similar  to  Gaussian  moments.  Furthermore, 
all  remaining  moments  of  the  estimation  error  other  than  the  second  will  be 
eliminated.  With  these  assumptions  the  approximate  estimation  structure  may 
be  written  as 

dx(t)  = f [x(t),  t]dt  + |^{[Z(t)TdZ(t)]12 

- h[x(t),  t]dt}  + P(t)  |* 

. U[x(t),  t]>_1 CdN(t)  - X[x(t),  t ]dt } 
where  the  approximate  estimation  error  covariance 

dP(t)  = { 2P ( t ) f£  |*  + g2 

- p(t)^  - — > t]_  P.(t ) . (liii -)^}(jt  + 

a;2  rttT  3x ^ x 

oX 

+ P(t)2  dN(t) 

3X  * 


Conclusion 

This  paper  considers  the  problem  of  estimation  using  mixed  observables 
when  the  observables  are  of  two  fundamentally  different  types.  The  first 
type  is  that  of  a rotational  process  and  the  second  type  is  that  of  a doubly 
stochastic  Poisson  process.  The  estimation  structure  is  obtained  using  the 
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differential  equation  for  the  conditional  probability  density  function.  A 
suboptimal  estimator  based  on  moment  approximation  is  given  for  this  problem. 

Exponential  Fourier  series  [8]  will  be  used  to  develop  additional  approxi- 
mate estimators  for  the  case  when  the  intensity  rate  is  a periodic  process. 

These  results  are  contained  in  the  long  version  of  the  paper. 
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Abstract 


It  is  shown  that  the  fixed  modes  of  an  interconnected 
dynamical  system  under  decentralized  control  are  precisely 
the  uncontrollable  and  unobservable  states  of  the  individual 
system  components.  As  such,  the  system  can  be  stabilized  by 
decentralized  controllers  if  and  only  if  its  individual 
system  components  can  be  stabilized.  Moreover,  these  con- 
ditions are  shown  to  be  equivalent  to  the  conditions  for 
stabilizing  the  system  using  a global  controller. 

Introduction 

Given  a linear  system  with  partitioned  inputs  and  outputs 

. n 

X = FX  + Z B1u. 

i-1  1 

1. 

y^  = C1X  i = 1,2,  . . . , n 

it  is  desired  to  design  a family  of  dynamic  decentralized  con- 
trollers 


2. 


Z . 
i 


S. Z . + R.y. 
11  1 


i = 1,2, 


ui 


Q.Z.  + K.y. 

li  iJ  l 


n 


which  place  the  poles  (eigen  values)  of  the  resultant  feedback 
system  in  prescribed  locations.  In  its  most  general  form 

1 2 

the  solution  to  this  problem  was  given  by  Wang  and  Davison.  ' 
Their  solution  is  formulated  in  terms  of  the  (diagonally)  fixed 
modes  of  the  systems 


3. 


0d(F,B,C)  = n X (F+BKdC) 


76 


Here,  B and  C are  the  matrices  B = coltB1)  and  C = row(C1), 
respectively,  \ (M)  denotes  the  set  of  eigen  values  of  the 
matrix  M,  and  the  intersection  is  taken  over  the  set  of  block 
diagonal  (complex1)  matrices  whose  partition  is  comformable 
with  the  partitions  of  B and  C.  Using  this  concept  of  fixed 
modes,  Wang  and  Davison  ' showed  that  the  eigen  values  of 
the  system  can  be  placed  in  a prespecified  open  region  of  the 
complex  plane  using  the  dynamic  decentralized  controllers  of 
equation  2,  if  and  only  if  lies  in  that  region.  More  pre- 
cisely, they  showed  that  6 , represents  the  set  of  eigen  values 
of  F which  cannot  be  moved  by  any  family  of  decentralized 
dynamic  controllers  while  all  remaining  eigen  values  of  F 
can  be  arbitrarily  placed  by  an  appropriate  choice  of  decen- 
tralized dynamic  controllers. 

If  one  observes  that  F+BK^C  is  just  the  state  matrix  for 
the  given  system  with  the  static  decentralized  feedback  matrix 
the  above  result  can  be  interpreted  as  a characterization  of 
the  eigen  value  placement  properties  of  the  system  under  dynamic 
decentralized  control  in  terms  of  its  eigen  value  placement 
properties  under  static  decentralized  control.  Indeed,  the 
theorem  of  Wang  and  Davison  states  that  those  eigen  values 
which  can  be  moved  at  all  by  static  controllers  can  be  ar- 
bitrarily placed  by  dynamic  controllers  whereas  those  eigen 
values  which  are  fixed  under  all  static  controllers  are  also 
fixed  by  dynamic  controllers. 

‘Precisely  the  same  theory  can  be  formulated  for  systems  characterized  by 
real  matrices  though  in  that  case  the  arguments  are  complicated  by  the  fact 
that  one  must  work  with  pairs  of  complex  conjugate  eigen  values  to  pre- 
serve reality. 


Since  the  partitioning  in  equation  1 is  arbitrary,  the  77 
above  described  theorem  can  be  applied  to  the  classical  case 
wherein  the  given  system  has  only  a single  input  and  output 
in  which  case  the  fixed  modes  of  the  system  are  given  by 

4.  9 (F, B,C)  = n X (F  + BKC ) 

where  the  intersection  is  now  taken  over  arbitrary  matrices, 

K,  which  are  conformable  with  B and  C.  Of  course,  in  this 
special  case  9(F,B,C)  reduces  to  the  usual  set  of  eigen  values 
which  are  either  uncontrollable  or  unobservable.4  Moreover, 

5.  9 (F,B,C) G 9d (F,B,C) 

since  the  intersection  used  to  define  9 is  taken  over  a larger 
set  of  matrices  than  that  used  to  define  9d>  Equation  5 form- 
alizes the  intuitively  obvious  fact  that  a system  which  is 
"controllable"  by  a family  of  decentralized  controllers  is  also 
"controllable"  by  a global  (centralized)  controller. 

The  purpose  of  the  present  paper  is  to  show  that  equation  5 
holds  with  equality  in  the  case  where  F,  B,  and  C represent  the 
dynamics  of  an  interconnected  dynamical  system4  in  which  y^ 
and  u^  denote  the  local  inputs  and  outputs  associated  with  a 
given  system  component.  As  such,  in  that  special  case  the 
eigen  values  of  the  system  can  be  placed  in  prespecified 
locations  by  decentralized  dynamic  controllers  whenever  they 
can  be  placed  in  the  same  locations  by  a global  dynamic  con- 
troller. Although  the  class  of  interconnected  dynamical  sys- 
tems is  considerably  smaller  than  the  class  of  decentralized 
systems  studied  by  Wang  and  Davison,  the  design  of  the  local 
controllers  for  the  components  of  an  interconnected  dynamical 
system  is  the  "physical  problem"  which  usually  motivates  the 


’8 


study  of  the  general  decentralized  control  problem.  As  such, 
we  believe  that  the  above  result  is  significant. 

The  class  of  interconnected  dynamical  systems  which  we 
consider  is  characterized  schematically  in  Figure  1 and 
mathimatically  by  the  set  of  equations 


Figure  1. 


Interconnected  dynamical  system 
with  local  controllers. 
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1 1 

y . 
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Here  the  first  two  equations  represent  the  dynamics  of  the  ith 
component,  whereas,  the  third  equation  defines  the  interconnection 
structure  in  which  the  input  to  the  ith  component  is  taken  to 
be  a linear  combination  of  the  outputs  of  the  various  components 
(including  the  ith)  and  an  external  control.  The  fact  that  the 
control  inputs,  u^ , are  not  multiplied  by  compansator  matrices, 
implies  that  the  local  controllers,  given  by  equation  2,  have 
full  access  to  the  inputs  of  the  individual  system  components, 
and,  similarly,  the  output  of  the  individual  system  components 
is  fully  accessible  to  the  controllers. 

For  notational  brevity  equation  6 may  be  restated  in  block 
matrix  form  as 

X = AX  + Ba 

7.  y = CX 

a = Ly  + u 


where  X = col(X^),  a = col(a^),  y = col(y^)  u = col(u^), 

A = diag(A^),  B = diag(B^),  C = diag(C^)  and  L = maML1-*). 
Combining  these  into  a single  equation  for  the  overall  composite 
system  we  obtain  the  composite  state  model 


8. 


where 


X = FX  + Bu 
y = CX 


9. 


F 


A + BLC 
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Moreover,  upon  observing  that 

n 

10.  Bu  = Z B1u. 

i=l  1 


and 

11.  y.  = C1X  i = 1,  2,  . . . , n 

where  B1  = col  (0,0  ...»  0,  B^ , 0,  . ..,  0)  and  C1  = row(0,  0,  ...» 
0,  , 0,  . ..,  0)  we  see  that  equation  8 naturally  decomposes 

into  the  form  of  the  decentralized  control  problem  of  equation  1. 
In  equations  8,  however,  the  B1  and  matrices  take  on  a special 
form,  whereas,  they  are  arbitrary  in  equation  1.  Indeed,  it  is 
this  special  form  which  yields  the  desired  equality  in  equation  5. 
Intuitively,  this  implies  that  the  ith  local  controller  may  drive 
only  the  state  of  the  ith  system  component  though  that  state 
may,  in  turn,  drive  the  remainder  of  the  system  through  the 
connection  equations.  Similarly,  the  ith  local  controller  may 
observe  only  the  state  of  the  ith  component  with  the  remaining 
components  being  observed  only  indirectly  through  the  state  of 
the  ith  component. 

Main  Theorem: 

Using  the  above  notation,  our  main  theorem  may  be  stated  as 
follows . 

THEOREM:  For  the  system  of  equation  8 

12.  ed(F,B,C)  = 0 (F  , B , C)  = 9 ( A , B , C ) =U  efA^B^C^) 

Proof:  To  show  that  0(F,B,C)  = 0(A,B,C)  we  simple  observe  that 


13. 


X (F+BKC)  = X (A+B(L1;l+K)C)  = X(A+BK'C) 


r 


where  K'  = + K.  As  such,  the  same  set  of  matrices  are 

spanned  if  one  takes  the  intersection  of  the  X (F+BKC)  over  K 
or  the  intersection  of  the  X(A+BK'C)  over  K'  and  hence  0(F,B,C)  = 

9 ( A , B , C ) . Moreover,  since  A,  B,  and  C are  block  diagonal  9(A,B,C) 
is  just  the  union  of  the  fixed  modes  associated  with  each  block. 

Given  5 to  prove  the  validity  of  the  first  equality  of  2,  it 
suffices  to  show  that  6d (F,B,C)  <=  9 (F, B,C)  . For  this  purpose, 
we  desire  to  show  that  if  X is  not  in  9(F,B,C)  then  it  is  not 
in  9d(F,B,C).  Initially,  we  assume  that  A,B,  and  C are  par- 
titioned as  2 by  2 matrices,  the  general  case  following  therefor 
by  induction.  If  X is  not  in  9(F,B,C)  then  there  exists  a K 
(dependent  on  X)  such  that  det  (XI  - (F+BKC))  ^ 0 and  we  desire 
to  construct  a block  diagonal  , also  dependent  on  X,  such  that 
det  (XI  - (F+BK^C) ) ^ 0.  To  this  end  we  write  out  the  matrix 
F+BKC  in  partitioned  form  and  expand  its  determinant  via  the 
formula  for  the  determinant  of  a 2 by  2 partitioned  matrix2 
obtaining 

0 ? det ( X 1 - (F+BKC))  = det(Xl  - (A  + B(L+K)C) 

r-  11  11*  1 7 12 

Xl-A^B^  C^B^  C1  j -BLL  C2-B1K  C2 

-B2L21C1-B2K^^C1  I Xl-A2-B2L22C2-B0K22C2  - 
(Xl-A1-B1L11C1-B1K11C1)det[xl-A2-B2L22C2-B2K22C2 

(E2L21C1+B2K21C1)  (X1-A1-B1L11C1-B1K11C1)  _1  (BjL^C.j+Bj.K12^  ) ] 

(Xl-A1-B1L11C1-B1K11C1)det[Xl-A2-B2L22C2-B2K22C2 

-B2L21C1 (Xl-A1-B1L11C1-B1K11C1) -1B1L12C2] 

| where 

J 


= det 

14. 

= det 


= det 
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k22  = k22+k21c1(xi-a1-b1l11c1-b];k11c1)~1b1l12c2 
+ l21c1(ai-a1-b1l11c1-b1k11c1)"1b1k12c2 
+ K21c1 (xi-a1-b1l11c1-b1k11c1)‘1b1k12c2 


16. 


r 

L11 

i _r 

' T12 

I L 

1 

L21 

1 

: l22 



and 

17. 


K = 


i — 

„n 

' . 12 

K 

i K 

1 

— 

j.21 

! ,,2  2 

K 

1 K 

are  partitioned  to  be  conformable  with  A,  B,  and  C. 
we  define  K,  via 


18. 


Kd  = 


.11 


! K22 


Now,  if 


and  compute  det  ( XI- (F+BK.C) ) we  obtain 

d 


19. 


det  (XI- (F+BKdC) ) = det(Xl-(A  + B(L+Kd)C) 


= det 


‘xi-a1-b1l11c1-b1k11c1  J -b1l12c2 


-B2L21C1  J X1-A2-B2L22C2-B2K22C2 
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det  ( l-A1-B1L1LC1-B1KLLC1)det[U-A2-B2L22C2-B2K22C2 


-b2l21cx  (.u-a1-b1l11c1-b1k11c1)  -1b1l12c2] 


= det  (XI  - (F+BKC))  7*  0 


Thus  there  is  at  least  one  block  diagonal  matrix  such  that 
X is  not  an  eigen  value  of  F+BK^C  showing  that  X is  not  in 
9d(F,B,C) . 

To  extend  the  above  argument  from  2 by  2 partitioned 
matrices  to  n by  n partitioned  matrices  we  repete  the  above 
construction  n-1  times  as  follows.  Given  an  n by  n 


K = 


1_2  _j  __  K * 3_J  ...  1 K^ 

K2  1',*  K22J  K23  | I K2n 

K31jj  K32  J K^_3'  [_K^_n 


k31]JA32  J k3_31 


Knl|j  Kn2  |~ Rn3 
'I  I 


such  that  det ( Xl-F+BKC) ) ^0  it  is  partitioned  into  a 2 by  2 
matrix  as  shown  by  the  double  line  whence  the  above  argument 
is  employed  to  formulate  a matrix 


K = 


K11}  0 I*  0 |. 

j — HI — — , 

0 K22"  K23 

0 1 K I,  K 

---  ---  -II- --  -L 


0 I K 2 1 1 K 3 |. 
I ~ ll  ~ 


I K2n 

F - 


. I K 

I “ 


such  that  det ( \1- (F+BKC) ) f 0 • This  matrix  is  then  repartitionen 
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into  a new  2 by  2 matrix  as  shown  by  the  double  line  in 
equation  21  and  the  process  is  repeated.  Since  the  1-1  entry 
in  the  partitioned  matrix  is  not  affected  by  the  process  this 
results  in  a new  matrix  of  the  form 


22. 


K 


K11  I 0 I 0 | 

0 I K22  I 0 | 

r -i  - 

0 i 0 IK33’ 


I 0 

r~~ 

I 0 

rK3n 

I =_  _ 


0 I 0 IKn3l I . . . 

L i i=  n 


such  that  ( A 1- (F+BKC)  ^ 0.  Repeating  the  process  n-1  times 

eventually  results  in  a block  diagonal  matrix  such  that 

det ( A 1- (F+BK^C)  j-  0 showing  that,  if  A is  not  in  e(F,B,C)  then 

it  is  also  not  in  9^(F,B,C)  thereby  verifying  that  e^(F,B,C) 

is  contained  in  9(F,B,C)  and  completing  the  proof  of  the  theorem. 

Given  the  theorem,  for  the  class  of  interconnected  dynamical 

systems,  local  control  is  just  as  good  a globax  control  from 

the  point  of  view  of  pole  placement.  From  the  point  of  view  of 

optimal  control,  however,  global  control  will,  in  general,  still 

4 

be  superior  since  it  gives  one  a greater  range  of  options. 
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Introduction: 

The  subject  of  two-dimensional  digital  filters  has  received  consi- 
derable attention  of  late:  in  particular,  two-dimensional  spectral  fac- 
torization has  been  treated  in  a number  of  papers  - it  is  considered  in 
great  detail  in  reference  [1].  The  major  problem  which  arises  is  that  in 
general  the  spectral  factors  of  a rational  transfer  function  are  not 
rational:  some  further  processing,  such  as  truncation  and  smoothing,  is 
usually  employed  to  yield  approximate  rational  factors.  It  is,  therefore, 
somewhat  surprising  that  the  class  of  rational  functions  for  which  a ra- 
tional spectral  factorization  exists  does  not  seem  to  have  been  investi- 
gated. In  this  paper,  we  give  two  sets  of  conditions  which  must  be  satisfied 
by  such  functions  (theorems  1 and  3);  a converse  is  given  which  may  be 
applied  to  the  numerator  and  denominator  polynomials  separately.  Now,  the 
polynomial  spectral  factors  (when  they  exist)  of  a given  polynomial  are 
minimum-  and  maximum-phase  polynomials;  conversely,  every  such  polynomial 
gives  rise  to  trivial  spectral  factors.  Motivated  by  this,  we  apply  the  re- 
sults of  theorems  1 and  3 to  the  particular  case  of  minimum-phase  polynomials 
(i.e.,  polynomials  without  zeros  in  the  unit  polydisc). 

In  this  context,  the  main  consequences  of  the  results  of  this  paper  may 
be  broadly  outlined  as  follows: 

i)  A given  polynomial  has  exactly  the  same  amplitude  response  as  a 
minimum-phase  polynomial  if  and  only  if  the  classical  one-variable 
method  (of  factoring  the  original  polynomial  into  a product  of  two 
polynomials  devoid  of  zeros  in  certain  regions)  can  be  applied. 

(This  result  is  in  fact  implicit  in  [1],  but  does  not  appear  to 
have  been  explicitly  stated  in  the  literature).  The  corresponding 
statement  for  minimum-phase,  stable  rational  functions  is  false, 
however. 
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ii)  If  the  conditions  given  in  theorems  1 and/or  3 are  not  satisfied, 
then  not  alone  is  there  no  minimum-phase  stable  rational  function 
having  exactly  the  same  amplitude  response  as  the  original;  the 
original  amplitude  response  can  not  even  be  approximated  arbi- 
trarily well  by  minimum-phase  stable  rational  functions.  This 
follows  from  the  fact  that  the  conditions  in  theorems  1 and  3 are 
conditions  on  the  amplitude  response  which  are  preserved  under 
any  reasonatie  kind  of  convergence. 

iii)  The  conditions  in  theorem  3 are  easily  visualized  and  surprisingly 
stringent;  they  require  essentially  that  the  gain  of  the  filter, 
averaged  over  certain  directions  in  the  frequency  plane,  have  no 
variation  in  a perpendicular  direction,  (see  the  discussion  follow- 
ing theorem  3).  This  gives  extremely  severe  restrictions  on  the 
amplitude  response  of  minimum-phase  FIR  filters,  minimum-phase 
stable  HR  filters,  and  the  denominator  polynomial  of  arbitrary 
stable  IIR  filters. 

iv)  It  has  been  pointed  out  by  Bose  [9]  and  Woods  [10],  and  again  is 
implicit  in  [1],  that  there  exist  pure ly  recursi ve  f i 1 ters  whose  am- 
litude  responses  are  not  real i zable  as  the  amplitude  response  of 
any  stable  purely  recursive  filter,  and  that  consequently  any 
stabilization  method  which  attempts  to  match  the  amplitude  re- 
sponse of  the  original  filter  is  doomed  to  failure.  The  restrict- 
ions referred  to  in  iii), above,  reinforce  this  conclusion  and 
identify  the  precise  properties  of  the  examples  in  [9]  and  [10] 
which  make  stabilization  impossible. 

Definitions  and  Notation: 

Our  notation  will  follow  that  in  [2];  we  repeat  it  here  for  convenience. 

For  simplicity  we  restrict  ourselves  throughout  to  two  dimensions,  although 
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there  does  not  appear  to  be  any  difficulty  in  extending  the  results  to 
higher  dimensions.  Thus  all  functions  are  assumed  throughout  to  be  rational 
functions  of  two  complex  variables  unless  otherwise  stated;  we  further  ex- 
clude the  zero  function.  Two-dimensional  complex  space  will  be  denoted 

2 . 2 

by  ,i.e,  (f  = { (Zj  • Z2)|  Zx  and  Z2  are  complex  numbers}.  The  open  unit 

2 

polydisc  will  be  denoted  by  U ,i.e., 

2 2 

U = { (Z j , Z2 ) e $ | | Z x | <1  and  jZ2|  <1} 

2 

and  its  closure  will  be  denoted  by  U : 

2 o 

' f = { (Zx , Z2)e  if  I I Zx | <1  and  |Z2|  ^1 } 

2 

The  distinguished  boundary  of  the  unit  polydisc  will  be  denoted  by  T : 

T‘=  { (Zx , Z2)e  f\  |Zx|  = 1 and  |Z2|  = 1} 

The  frequency  response  of  the  filter  whose  transer  function  is  f (Z x »Z2 ) is 

o 

L. 

simply  the  restriction  of  f to  T . We  will  find  it  convenient  to  denote  this 
restriction  by  f*. 

The  one-dimensional  sets  corresponding  to  the  above  are: 

U = (Z  e $ | |Z|  <1 } 

U = (Z  e <H  |Z|  <1} 

T = (Z  e ( | |Z|  =1} 

2 

We  need  one  further  subset  of  $ : 

V2--  { (Zj  ,Z2)  e if  | |Zx|  >1  and  |Z2|  >1 } . 

2 

By  the  Fourier  coefficients  of  a function  h (e1,e2)  defined  on  T we 


mean  the  numbers 
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2tt  2tt 

a = 1 tt  tt  -j(me1+  ne2)j„ 

mn  — 2 n(0j,02/  6 09 j 002* 

4 IT  0 0 

Finally,  let  us  state  precisely  what  we  mean  by  the  term  spectral 
factorization.  Several  different  forms  of  spectral  factorization  are  treated 
in  [1];  here  we  will  be  concerned  only  with  the  simplest  form:  if  f is  a 
rational  function,  it  will  be  said  to  have  a (rational,  quarter-plane) 

spectral  factorization  if  f = f x f 2 where  f[  and  f2  are  rational  functions, 

2 2 

fj  has  no  poles  or  zeros  in  U , and  f2  has  no  poles  or  zeros  in  V . Several 
comments  are  in  order  concerning  this  definition: 

i)  By  "rational"  we  mean  only  "finite-order";  i.e.,  the  functions 

are  assumed  to  be  expressible  as  the  quotient  of  two  (finite-order) 
polynomials. 

ii)  The  quarter-plane  property  enters  only  in  connection  with  the 

regions  in  which  the  factors  are  assumed  to  be  zero-  and  pole-free; 

2 

in  particular,  if  f has  no  poles  or  indeterminacies  on  T , and  has 
a quarter-plane  spectral  factorization,  then  there  is  a quarter- 
plane  causal,  stable  filter  whose  amplitude  response  is  eaual  to 
|f*|. 

iii)  It  would  possibly  be  more  natural  to  work  with  (J2  and  V2  rather  than 
U2  and  V2  (especially  when  considering  stability).  However,  to  do 
so  would  complicate  the  statements  of  the  theorems  considerably,  and 
it  is  usually  clear  whether  or  not  the  results  will  hold  with  U2  and 

V2  in  place  of  U2  and  V2.  (One  needs  only  to  check  for  zeros  and 

2x 

poles  on  T ).  In  general,  if  the  "closed"  version  is  not  obvious, 
it  is  not  true;  1 - Zx  Z2  will  serve  as  a counterexample  in  all 


such  cases. 
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iv)  To  simplify  the  statements  of  the  theorems,  the  definition  has 

been  given  in  terms  of  the  rational  function  f itself,  rather  than 

2 

the  spectral  function  |f*|  ; however,  the  conditions  given  in  the 

theorems  actually  involve  only  f*1  . 

. 2 2 

v)  We  note  that  V is  defined  to  be  a subset  of  C i thus  the  behaviour 

of  functions  at  infinity  is  irrelevant  to  our  purposes. 

Spectral  Factorization: 

Our  first  criterion  for  the  existence  of  rational  spectral  factors  is 
very  much  in  the  spirit  in  which  spectral  factori zat ion  is  treated  in  [l1. 
it  is  a trivial  consequence  of  theorem  5.4.7  in  [2]. 

Theorem  1_: 

If  a rational  function  f on  ( nas  a rational  spectral  factorization 

then  the  Fourier  coefficents  a of  log  f*  are  zero  for  all  pairs  of  in- 

mn 

tegers  (m,n)  such  that  m f o,  n t o,  and  m and  n have  different  siqns  - 
that  is,  for  all  inteqer  points  in  the  second  and  fourth  quadrants  The 
converse  is  true  for  polynomial  f. 

As  mentioned  above,  this  criterion  involves  only  the  absolute  value 
of  f;  it  follows  that  the  existence  of  spectral  factors  imposes  restrictions 
on  the  amplitude  response  of  a two-dimensional  filter  - in  contrast  with 
the  situation  in  one  dimension.  The  above  criterion,  however,  does  not 
present  these  restrictions  in  an  easily  visualized  form  - for  instance,  it 
is  difficult  to  guage  exactly  how  severe  the  restrictions  are.  For  this 
reason,  we  next  present  conditions  which  are  stated  in  terms  of  the  log- 
amplitude  response  itself,  rather  than  Its  Fourier  coefficients.  This 
result  takes  an  approach  which  seems  to  differ  substantially  from  those 
previously  known;  it  gives  easily  visualized  necessary  conditions  on  those 
rational  functions  which  admit  a rational  spectral  factorization.  Before 
we  state  this  theorem,  however,  we  first  present  a simple  result  which  will 


95 


be  used  in  the  proof,  and  is  also  of  separate  interest;  one  of  its  conse- 
quences is  that  when  rational  spectral  factors  exist,  the  usual  one-dimensional 
stabilization  method  (for  unstable  denominator  polynomials)  can  be  used. 

Theorem  2: 

If  the  rational  function  f admits  a rational  spectral  factorization, 

% % 

then  there  is  a rational  function  f (with  deg  f < deg  f)  such  that 

|f*l  * |f*| 

and  f has  no  poles  or  zeros  in  U . 

Again,  the  converse  holds  for  polynomial  f. 

’hus.  if  the  denominator  polynomial  of  an  unstable  filter  has  polynomial 
central  factors,  there  is  a stable  filter  of  at  most  the  same  order  with 
the  same  amplitude  response  (provided  the  polynomial  has  no  zeros  on  T ) . 

Again  most  of  the  proof  is  contained  in  [2];  we  fill  in  the  details 

p 

here  suppose  f has  rational  spectral  factors,  then  f • f(  ^ where  ft  has  no 
poles  or  zeros  in  U and  P and  0 are  polynomials  without  zeros  in  V . 
m n 

Let  P • Z,  Z.  P (1/Z, • 1/Z2 ) • Z3  t 0.  Z,  t 0 

where  m is  the  degree  of  P in  Z . n is  the  degree  of  P in  Z^,  and  P is  the 
polynomial  whose  coefficients  are  the  complex  conjugates  of  the  coefficients 

of  P.  Clearly  P is  a polynomial  of  degree  less  than  or  equal  to  the  degree 

•v 

of  P,  and  so  is  also  defined  for  Z]»0  and  Z2»0.  Now  if  P(ZltZ2  * 0 for 
Z,  t 3 and  Z2  t 0,  then  P(l/Zlt  1/Z2)  ■ 0;  this  implies  that  either 

2 

| 1/Z! | < 1 or  |1/Z2|  <.  1 (since  P has  no  zeros  in  V ) 

|Z,|  >1  or  1 Z2 | I 1.  i-e.,  (Zlt  Z2)  i U 


and  so  either 
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Thus  the  only  possible  zeros  of  P in  U are  for  lx  = 0 or  Z2  = 0.  But 
by  standard  results  in  the  theory  of  several  complex  variables  [8],  if 
the  zero-set  were  nonempty,  this  would  imply  that  either  lx  or  Z2  was  a 

factor  of  P,  which  is  impossible  by  our  choice  of  m and  n.  Thus  P has 

2 2 

no  zeros  in  U . Finally,  on  T 

'v  m m 

|P(Z!,Z2) [ = | Z ! Z,  P (1/Zi ,1/Z2) I = | P ( Z ! , Z2 ) | = |P(Zi,Z2)|. 

% 

Q is  defined  similarly  and  has  similar  properties.  Then 


% 


clearly  has  the  required  properties. 

Conversely,  suppose  f is  any  polynomial  for  which  there  is  a 
•v  2 

rational  function  f without  poles  or  zeros  in  U such  that 

|f*|  * |f*| 

then  f/f  is  rational  and  analytic  in  U , and 


|(f/f)*i  - 1 

Thus  by  theorems  5.2.5  and  5.2.6  in  [2],  f/f  3 P/Q  where  P and  Q are 

2 2 

polynomials,  P has  no  zeros  in  V , and  Q has  no  zeros  in  U . Then 

f 3 Pf/Q 

gives  a rational  (in  fact,  polynomial)  spectral  factorization  of  f. 
The  Second  Criterion: 

Our  second  set  of  conditions  for  the  existence  of  a rational 


spectral  factorization  is  given  in  the  following: 
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Theorem  3: 

2 

If  a rational  function  f on  (f  admits  a rational  spectral  factori- 
zation, then 


l | l’  tog  |f{e^me,  ej(n  e * ’>)|de 

is  a constant  independent  of  (0  £ ¥ < 2 it),  for  all  integers  m > o and 
n > o. 

Again,  these  conditions  depend  only  on  the  amplitude  response  of  f. 

The  simplest  condition  is  that  for  m = 1 and  n = 1;  it  can  be  easily  visua- 
lized by  drawing  two  adjacent  squares  in  the  e2  - plane  on  which  the 
amplitude  response  is  defined  (the  frequency  response  extends  to  the  entire 
3 i $2  - plane  by  periodicity),  and  drawing  lines  with  slope  1 and  length 
2tt*7  on  these  squares;  see  figure  1. 

Then  the  condition  for  m = 1,  n = 1 can  be  restated  as:  the  "average" 
amplitude  of  the  function  f along  the  line  Li  is  a constant  - that  is,  it 
is  as  independent  of  the  particular  line  L..  chosen.  ("Average"  here  is  to 
be  understood  as  the  geometric  mean  of  the  amplitude,  or  the  arithmetic 
mean  of  the  log-amplitude).  Alternatively,  we  may  say  that  the  average 
level  of  the  amplitude  over  any  line  of  slope  1 and  of  length  2irv7  is  in- 
dependent of  the  position  of  the  line  in  the  9i02-plane.  (For  example,  we 

A 

could  vary  the  over  the  dotted  square  in  the  direction  in).  The  condi- 
tions for  higher  m and  n have  a similar  interpretation,  with  a slope  of 
n/m  instead  of  1,  and  length  2ir/m2  + n2  instead  of  2*/?;  clearly,  if  m and 
n are  not  relatively  prime,  the  corresponding  condition  is  superfluous. 

This  theorem  then  gives  a striking  limitation  on  the  amplitude  re- 
sponse of  a rational  function  which  admits  a rational  spectral  factori- 
zation; even  the  simplest  of  the  conditions  (that  for  n=m=l)  implies  that 


r 
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such  a function  can  not  accurately  approximate  an  amplitude  which  has 

A 

large  variations  in  overall  level  in  the  direction  n shown  in  figure  1. 
Proof  of  Theorem  3: 

In  view  of  theorem  2,  it  suffices  to  prove  this  under  the  assumption 

2 

that  f has  no  poles  or  zeros  in  U . This  assumption  implies  that  f has  a 

2 

holomorphic  logarithm  in  U . Then,  for  any  integers  m > o,  n > o and  any 
real  number 

log  f(Zm,  Zn  ejv) 

is  a holomorphic  function  of  one  complex  variable  for  Z e U.  Thus 
Re (log  f(Zm,  Zn  ejT)) 

is  a harmonic  function  in  U,  and  so  by  the  mean-value  property  of  harmonic 
functions 


I 

2n 


Re  (log  f(Zm,  Zn  ejf))  de  = Re  (log  f(om,  on  ejT)) 
T 


2 TT 

i.e.,  X Re  (log  f(e^me,  e^ne+H^))  de  = Re(log  f(o,o)) 

2-  .0 

But  Re  log  w = log  |w|  for  w + o,  and  so 


^ q log  | f(e^me,  eJ  (n9  + '*') ) | de  = log  |f(o,o)| 

and  the  right-hand  side  is  independent  of  Y (and,  incidentally,  of  m and 
n also). 

An  obvious  question  which  arises  is  the  extent  to  which  the  converses 
of  these  results  hold.  In  fact,  the  converse  of  Theorem  3 holds  for  poly- 
nomials, and  modified  converses  of  both  Theorems  1 and  3 hold  even  for 
rational  functions.  The  modification  takes  the  following  form:  If  the 
Fourier  coefficients  of  log  |f*|  (where  f is  a rational  function)  vanish 


L 


A 
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% 

for  mn  < o,  then  there  is  a rational  function  f with  rational  spectral 
factors, (equivalently,  a rational  function  without  poles  or  zeros  in  U2),  such 
that  |f*|  = |f*|.  (A  similar  statement  holds  for  Theorem  3).  However, 
the  proofs  of  these  converses  involve  some  technical  analytic  details,  and 
so  are  relegated  to  an  appendix. 

The  modification  in  the  above  converses  lies,  of  course,  in  the  fact 
that  we  cannot  conclude  that  f itself  has  rational  spectral  factors;  thus 
there  are  some  rational  functions  which  can  be  stabilized  without  changing 
the  ampl itude  response  but  to  which  the  classical  1-variable  factorization 
technique  cannot  be  applied.  A simple  example  of  this  is  the  function 


f(Zi,  Z2)  = 


Zi  + Z2  - 1 

Zi  + Z?  - Z1Z9 


Here,  |f*|  is  identically  1,  and  so  has  trivial  spectral  factors;  but  f it- 
self clearly  does  not. 

Although  the  converses  of  theorems  1 and  3 are  proved  in  the  appendix, 
there  is  another  result  related  to  the  converse  of  Theorem  3;  by 
strengthening  the  conditon  for  m=n=l  alone,  we  can  get  a stronger  converse 
for  polynomials.  Before  we  state  this  converse,  however,  we  first  give  a 
stability  criterion  (used  in  the  proof  of  the  converse)  which,  although 
previously  known,  [3],  has  not  appeared  in  the  engineering  literature. 

Although  not  as  sharp  (in  terms  of  dimension)  as  some  other  known  criteria  [4], 


it  has  two  advantages  which  make  it  useful  for  theoretical  purposes:  first, 
it  is  given  in  terms  of  a one-parameter  family  of  discs  without  the  lower- 


dimensional test  in  [5];  and  second,  unlike  most  other  stability  tests,  which 

conclude  the  nonvanishing  of  a polynomial  on  U from  its  nonvanishing  on 

_2  2 

some  subset  of  U which  contains  T , this  test  allows  the  polynomial  to 
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2 

not  vanish  on  U . The  criterion  is: 
Theorem  4: 


Suppose  a polynomial  f has  no  zeros  in  the  set 

2 


t(Zi,  Z2)  e U 


then  f has  no  zeros  in  U . 


|Z2|>; 


This  is  proved  in  a much  more  advanced  context  in  [3];  however,  it 
can  also  be  easily  proved  by  applying  one  of  the  criteria  in  [4]  to  i:he 
polydiscs 


uj  = {(Zi,Z2)  c (2 


lzil  < r>  U2I  1 r} 


for  o<r<l . 

For  the  hypotheses  imply  that  f has  no  zeros  on  the  distinguished  boundary 
2 

of  Ur  (for  o<r<l),  and  none  on  the  set 

{(Zi,z2)  e (2|z1=z2)  n \ 

Thus  by  theorem  5 in  [4],  f has  no  zeros  in  Ur  for  any  r < 1,  and  so  f has 
2 

no  zeros  in  U . 

We  can  now  state  and  prove  the  partial  converse  to  theorem  3. 


Theorem  5: 

If  f is  a polynomial  with  the  property  that 


1 

Ztt 


J(e+f) 


) de  = log  |f(o,o)|<for  o<r<2n 


j 


2 

then  f has  no  zeros  in  U . 


Thus  we  strengthen  the  condition  for  m=l  and  n=l  in  theorem  3 by 

specifying  that  the  constant  in  question  is  to  be  log  |f(o,o)|:  it  then 

follows  not  only  that  f has  rational  spectral  factors,  but  that  it  is 

2 

actually  zero-free  in  U . 


Hroor : 

By  theorem  4,  it  suffices  to  prove  that  f has  no  zeros  in  the  set 
{(Zi ,Z2)  e U j | Zx | = |Z2| } 

But  this  set  is  the  union  of  the  open  discs 
{(Zi,Z2)|z2  = ejfZlt  | Z ! | <1  },for  o < T < 2n; 


we  therefore  wish  to  prove  that  f has  no  zeros  in  any  of  these  discs; 
or  equivalently  that  the  function  f^  of  one  variable  defined  by 
f^  (Z)  = f(Z,Zejy)  has  no  zeros  in  the  open  unit  disc.  Applying  Jensen’s 
formula  [6,  p.299]  for  the  unit  disc  to  fy,  we  get 


1 

2tt 


f,  <eJ'9) 


de  = log 


yo} 


- I log 


where  the  summation  is  over  all  the  zeros  (counted  with  multiplicity)  of 
fy  in  the  unit  disc.  Expressing  this  in  terms  of  f: 


1 
2 7T 


pj(e+f) 


de  = log 


f(o,o) 


— I log  IZi 


and  so  £ log  | Z . | = 0. 

Since  for  any  Z^  in  the  open  unit  disc  log  | Z - | < o,  the  conclusion  follows. 
(It  is  clear  from  the  proof  that  we  always  have 


1 

2ir 


,(eje_ej(e*r) 


de:  >_  log 


f(o,o) 


it  follows  from  this  that  in  fact  the  apparently  weaker  condition 

log  | f(e^01 .e^®2) |de1de2  = log  |f(o,o)| 

2 

is  sufficient  to  guarantee  that  f is  zero-free  in  U . See  [2,p.73]). 

Stable  IIR  Filters  and  Minimum-phase  FIR  Filters: 

The  very  close  relationship  of  spectral  factorization  to  the  nonvanishing 
2 

of  polynomials  in  U , and  thereby  to  stable  IIR  filters  (via  the  denominator 
polynomial)  and  minimum-phase  filters  (via  the  numerator  polynomial)! is  al- 
ready clear  from  the  previous  sections.  The  force  of  theorem  2 is  that 
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purely  from  the  point  of  view  of  amplitude  response  transfer  functions  having 

rational  spectral  factors  are  equivalent  to  those  without  poles  or  zeros  in 
2 

U . Thus  the  restrictions  on  amplitude  response  in  theorems  1 and  3 apply 
to  the  denominator  polynomial  of  any  stable  HR  filter;  the  contribution  of 
the  denominator  polynomial  to  the  overall  amplitude  response  of  the  filter 
(in  the  case  of  an  all-pole  filter,  the  entire  amplitude  response)  must 
satisfy  the  restrictions  imposed  by  theorems  1 and  3.  We  have,  therefore, 
identified  the  properties  of  the  amplitude  response  which  make  it  impossible 
to  stabilize  a filter;  if  the  original  amplitude  response  has  large  .'overall 
variation  in  the  "wrong"  directions,  attempting  to  find  a stable  filter  which 
closely  matches  this  response  is  futile.  Close  matching  of  the  amplitude 
forces  instability.  This  has  already  been  shown  by  example  by  Bose  [9]  and 
Woods  [10]:  we  now  see  that  it  is  the  variations  in  the  amplitude  response 
in  the  "wrong"  directions  in  their  examples  which  accounts  for  their  be- 
haviour. 

It  is  also  of  interest  to  note  that,  in  the  Shanks  procedure  of  minimizing 
||  | f g-1 | 2 de1d92 

over  all  polynomials  f of  given  degree  (where  g is  the  original  polynomial), 

if  the  allowable  f's  were  restricted  to  those  which  have  polynomial  spectral 

2 

factorizations,  the  procedure  would  yield  a polynomial  devoid  of  zeros  in  U . 

It  does  not  appear  that  this  observation  can  be  used  as  the  basis  for  a work- 
able stabilization  method,  however,  since  the  condition  that  f have  polynomial 
spectral  factors  is  intractabl y nonl inear  in  the  coefficients  of  f;  and  further, 
in  many  cases  this  procedure  would  yield  an  f which  was  only  marginally  stable. 
For  the  same  reasons,  restricting  oneself  throughout  the  design  procedure  to 
polynomials  which  satisfy  the  condition  in  theorem  5 does  not  appear  to  be  a 
feasible  method  of  ensuring  stability. 
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Examples  and  Comments: 

An  example  of  the  behaviour  of  those  polynomials  not  possessing  poly- 
nomial spectral  factors  has  already  appeared  in  the  literature,  although 
in  a different  context;  we  repeat  this  example  here. 

A(Z!,Z2)  = 1 - - 75Zx  + -9Zi  + 1 . 5Z2  - 1.2ZXZ2  + 1.3ZiZ2  + 1.2Z2  + ^Z*  + .5ZiZ2 


This  polynomial  was  studied  in  [7];  the  associated  Shanks  polynomial  was 
found  to  be  stable  but  to  have  a substantially  different  amplitude  response  • 
from  that  of  A (for  more  details,  see  [7]).  The  fact  that  A does  not  have 
polynomial  spectral  factors  was  established  by  checking  the  condition  in 
theorem  3 for  m=n=l  and  f = o,  'H  - it,  with  the  following  results:  (correct  to 
nine  decimals) 


1 

2tt 


1 

2ir 


d0  = .696570700 


e3(9+7T) ) | d0  = 1.134686936 


As  an  example  of  a polynomial  with  rational  spectral  factors,  we  have 

B(Z!,Z2)  = 1 + 2.25Zi  + 2.25Z2  + . 5Z2  + .5Z2  - 6.5ZXZ2  - z\zz  - ZxZz 

2 2 
- 4ZrZ2 

This  factors  into  (1  + .25ZX  + .25Z2  + .5ZXZ2)(1  + ZZX  + 2Z2  - 8Z1Z2) 

2 2 

the  first  factor  factor  having  no  zeros  in  U , the  second  none  in  V ; re- 

2 

versing  the  second  factor  gives  a polynomial  without  zeros  in  U : 

B(Zi,Z2)  = (1  + .25Zj  + .25Z2.  + .5  ZiZ2) (-8  + 2Z2  + 2Zi  + ZiZ2) 

2 2 2 2 2 2 
= -8^-2ZiZ2  + .5  Z\  + .5Z2  + 1 .25ZjZ2  + 1.25ZiZ2  + .5ZjZ2, 

'V 

and  B has  the  same  amplitude  response  as  B. 
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In  order  to  gain  some  idea  of  the  stringency  of  the  conditions  in 
theorem  3,  let  us  consider  the  case  of  an  ideal  band-pass  filter.  By 
an  ideal  band-pass  filter  we  will  mean  a filter  whose  amplitude  response 
is  equal  to  1 on  some  subset.  A,  of  the  square  o<0i<2ir,  o<02<2 it,  and 
equal  to  K « 1 on  the  complement  of  A (of  course  this  specification  con- 
tinues over  the  whole  plane  by  periodicity).  This  of  course  is  not  the 
amplitude  response  of  any  rational  function,  but  in  practice  for  certain 
shapes  of  the  set  A,  one  may  wish  to  approximate  such  a response  by  a 
rational  function.  One  easily  sees  that  up  to  a scale  factor,  the  averages 
in  theorem  3 are  in  this  case  merely  the  fraction: 

length  of  the  line  lying  in  the  complement  of  A 
Total  length  of  the  line 

It  is  easily  seen  from  this  that  there  are  very  few  passband  shapes  of 
practical  interest  which  satisfy  even  the  first  of  these  conditions  (where 
n=l  and  m=l ) ; in  other  words,  very  few  which  can  be  accurately  approxi- 
mated by  transfer  functions  having  rational  spectral  factors.  (This  is 
not  to  imply  that  one  would  in  practice  be  restricted  to  such  filters;  the 
above  discussion  is  meant  solely  as  an  indication  of  the  severity  of  the 
restrictions  on  the  amplitude  of  such  filters). 

Finally,  we  remark  that  there  does  not  seem  to  be  any  difficulty  in 
extending  the  results  in  this  paper  to  higher  dimensions,  and  to  multi- 
dimensional systems  other  than  digital  filters. 
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APPENDIX 


The  converses  to  Theorems  1 and  3. 


These  converses  involve  some  technical  ideas  and  results  from  [2];  the 
most  important  ideas  are  those  of  inner  function  [2,p.105],  outer  function 
[ 2,p.  72],  Poisson  integral  [2,  p.17]  and  the  classes  N(U2)  [2,  p.44]  and 
N*(U2)  [2,  p.44]. 

We  will  also  use  the  following  notation  from  [ 2]  (Here  f is  an  analytic 
function  on  U ) : 

i.  f*(eJ01,  e-^02)  4 lim_  f(reJ01,  re^02  ) 

r -*■  1 " 

will  denote  the  radial  limit  of  f 

(this  is  clearly  consistent  with  our  previous  use  of  f*); 

2 

ii.  For  w = (wlt  w2)  e T , fw(Z)  will  denote  the  one-varible 
function  defined  by 

fw(Z)  4 f(Zwlt  Zw2) ; 

. 2 

iii.  if  p is  a function  defined  on  T which  is  absolutely  integrable 


there. 


P(m,n)4  jp 


2 IT  f2TT 


0 JO 


expf-jme!  -jne2)  0 (e x , 0 2 ) de2d6i 


will  denote  the  Fourier  coefficients  of  p. 

. 2 

iv.  For  any  function  f on  T , 

1 2 TT  (2  IT 

477  r(® i »®2/  de2dei  will  be  denoted 

f dm  or  ( $ (w)  d m (w) . 

JT2  J T2 


We  will  first  prove  the  converse  to  Theorem  1,  and  from  this  derive 
the  converse  to  Theorem  3.  First  of  all,  however,  we  need  the  following 
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lemma  (which  is  given  as  a problem  in  [2]). 

Lenina  A1 : 

If  ())  is  a real-valued  function  defined  on  T2 
such  that 

(p  £ L (T2)  (i.e.,  [ IHdrtX”) 

JT2 

and 

0(m,n)  = 0 for  mn  < 0, 

then  there  is  an  outer  function  f on  U2  such  that 
P[(p]  = log  1 f l 

(where  P[  ] denotes  "Poisson  integral  of"). 


Proof 


Let  a 


mn 


(j>(m,n)  (m,n)  f (o,o) 

1/2  $(m,n)  (m,n)  = (o,o) 


and  let 


9(Zi»Z2)  = E 


m=o  n=o 


a 7rn  7n 

3 L i Ly 

mn  1 z . 


This  series  clearly  converges  uniformly  on  compact  subsets  of  U2, 
and  so  defines  an  analytic  function  there. 


3 
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If  we  let  f = e^ 

2 

then  f is  analytic  in  U , and 


CO  CO 


log  | f | * £ £ amn  r1?  r£  exp  (jmej  + jne2) 


m=o  n=o 


OO  CO 


+ 1 z amn  ri  r2  exp  ”J n02 ) 

m=o  n=o 


GO  OO 


= E E $ (m,n)r|mlr|nUxp(jme1  + jne2  ) 
m=-°°  n=-°° 

P[<p]  [2..P-17] 

Next  we  prove  that  f is  outer;  we  have  (for  0<r<l ) 


Jog  | f ( rw ) ] dm  (w) 


_2  I log  I f(r»w)|  jdm(w) 
.2  lpC<P](rw)  | dm(w) 


T 

< OO 


| (p  ( w ) | dm(w) 


[2 ,Thm . 


and  so  f e N(lr). 

p 

Now  f*  exists  almost  everywhere  on  T [2,Thm.3.3.5]  and  log|f*| 
everywhere  on  [2,Thm.2.2.1];  thus  log  |f|  = P[log  |f*|]  and  so 
f e N*(U2)  [2,Thm.3.3.5],  and  log  |f(0)|  * f - log  |f*(w) |dm(w). 

Jr 


Thus  f is  outer. 


Q.E.D. 


We  can  now  prove  the  converse  to  Theorem  1 : 


2.1 .3(c)] 


(p  almost 


.J 
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Theorem  A2: 


Let  f(Z,  be  a rational  function  ($0),  and  let 
$ = log  |f*| 

If  (j>(m,n)  = 0 for  mn<0,  then  there  is  a rational  function  g without  poles 

o 

or  zeros  in  U such  that  |g*|  = |f*|. 

Proof: 


By  Lemma  A1 , there  is  an  outer  function  g such  that 
log  j g I = P[log|f|]. 


This  implies 


2 

log  |g*|  = log  |f*|  almost  everywhere  on  T . 


Therefore,  for  almost  all  w eT^ 


log  |g*  (Z) | = log  |f*  (Z)|  for  almost  all  ZeT  [2, Lemma  3.3.2], 

W W 

2 

and  g is  outer  for  almost  all  w eT  [2, Lemma  4.4.4]. 

W 

For  any  such  w,  let  Z-j , ...,  ZR  denote  the  poles,  and  Zp+1 , . ..  Zm  the 
zeros,  of  fw(Z)  in  U,  and  let 


m 

n 


f„(z)  - n i^k  

k=l  k=n+l  Z-Zk 


Zk  Z'1  f„(Z) 


Then  fw  has  no  poles  or  zeros  in  U and  is  rational;  hence,  f is  outer.  Since 
gw  is  outer,  we  have  fw/9w  is  outer.  Also  |fw*|  - |fw*|,  and  so  |fw*|  = |gw*| 
for  almost  all  ZeT.  Thus  fw/9w  is  inner.  But  a function  which  is  both  outer 
and  inner  is  a constant  of  modulus  1,  and  so 


q * eJ'1'  f 
aw  w 


for  some  real  f. 
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2 

Thus  gw  is  rational  for  almost  all  weT  , and  so  gw  is  rational  for  all 
2 

weE,  where  EcT  is  a compact  set  of  positive  measure  (by  the  inner 

regularity  of  the  measure).  It  follows  by  [2, Thm.  5.2.2]  that  g is 

rational  (since  the  vanishing  of  a polynomial  P on  a set  of  positive  measure 
2 

in  T would  imply 

log  |P*|  i lV2) 

and  so  P = 0.) 

Thus  g is  a rational  function  without  poles  or  zeros  in  U2,  and 

|g*|  = |f*|  almost  everywhere  in  T2 
and  so,  since  g and  f are  both  rational, 

|g*|  = |f*|  on  T2. 

Q.E.D. 

We  next  prove  the  converse  to  Theorem  3: 

Theorem  A3: 

Let  fU-j^)  be  a rational  function  (£  0)  and  let 
(p  = 1 og  j ( 

If  1 (2T 

2ir  (p  (me,  ne  + Y)de  is  a constant  independent  of  v for  each 

pair  (m,n)  with  m>o  and  n>o  then  there  is  a rational  function  g without 
poles  or  zeros  in  U2  such  that  |g*|  = |f*|. 
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Proof: 

Let  m > o , n > o , and  let  JL\  o be  an  integer . 

Then 


2tt 


o e 


jimt 


2 IT 


2^  fa11  . 

0 Jo  eJ' 


Jimf 


0(me,n9  +4,)de  d¥  = 0 


$ (me,ne  +f)ded4'  = 0 


Making  the  change  of  variables  defined  by 


9 = 1 0i 
m 

4/  = 92_  n 9i  i 
m 


we  get 


1 

m 


21T  r 0.9,  + 
f m 1 

ng  exp  (jjm92-  itng,)  0 ( 9i  ,92)d92d9i  = 0. 
m' 1 


and  since  the  integrand  is  periodic  in  9X  and  92 


2n 


f2  tt 


J J exP  (jime2  - jjen9 4 ) 0 (9 x ,92)d92d9!  = 0 
0 0 

and  so  0 (-in,  im)  = 0 for  all  Jl  f 0,  m > 0 and  n > 0, 
that  is. 


0(m,n)  = 0 for  all  m,n  with  mn  < 0. 

The  result  now  follows  from  Theorem  A2. 

Q.E.D. 


Finally,  we  note  that  if  f in  Theorem  A3  is  a polynomial,  then  the  converse 
in  Theorem  2 implies  that  f has  polynomial  spectral  factors.  Thus  we  have  the 
full  converse  of  Theorem  3 for  polynomials. 
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ABSTRACT 

Optimal  estimators  are  derived  for  a class  of  signal -dependent 
noise  processes.  Such  processes  are  of  interest  in  optics  because  cer- 
tain phenomena,  such  as  film  grain  strates  that  when  one  ignores  the 
presence  of  signal -dependent  noise  and  instead  assumes  only  signal-in- 
dependent noise  models,  the  resulting  estimators  may  pay  a severe  penalty 
in  performance.  This  "mismatch"  problem  is  explored,  with  the  results  of 
Monte  Carlo  simulations  of  the  performances  of  both  optimum  and  mismatched 
estimators  being  presented.  The  Cramer-Rao  lower  bounds  on  the  mean 
square  estimation  errors  for  unbiased  estimators  are  evaluated  and 
compared  with  the  lower  bounds  derived  for  the  signal -independent  noise 
case.  Overall,  the  results  indicate  that  improved  performance  will,  in 
most  cases,  offset  the  increased  complexity  inherent  in  estimators  de- 
signed for  the  signal -dependent  noise  model. 

Introduction 

In  contrast  to  the  signal-independent  additive  noise 

models  traditionally  encountered  in  statistical  communi- 
1 2 

cation  theory  ' , many  physical  noise  processes  are  in- 
herently signal-dependent.  Common  examples  from  optical  pro- 
cessing include  film  grain  noise,  encountered  in  image  pro- 
cessing, and  photoelectronic  shot  noise,  which  is  some- 
times dominant  when  imaging. at  low  light  levels  with  photo- 

3 4 

emissive  detectors.  ' An  example  of  a non-optical  noise 
source  which  is  effectively  signal-dependent  is  magnetic  tape 
recording  noise.5  A study  of  these  particular  examples 
indicates  that  studies  of  optimum  estimation  in  signal- 
dependent  noise  processes  would  have  applications  to  a broad 
class  of  signal  processing  problems  in  modern  optics  and  in 
other  fields. 
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To  date,  the  majority  of  the  work  dealing  with  signal- 

dependent  noise  has  been  concentrated  on  rather  specialized 

examples  and  applications.  Using  a Poisson  point  process 

noise  model,  Goodman  and  Belsher  ® have  considered  the 

restoration  of  atmospherically  degraded  images  using  linear 

4 

minimum  mean  square  error  filters.  Walkup  and  Choens 
modified  the  familiar  Wiener  filter  for  various  additive, 
Gaussian  signal -dependent  noise  models,  and  Naderi7  has 
done  considerable  additional  work  on  this  problem.  Addi- 

Q ... 

tionally.  Hunt  has  derived  a nonlinear  maximum  a posteriori. 

(MAP)  estimator,  based  on  a different  model  than  the  one 

considered  here,  which  can  accomodate  both  signal-dependent 

and  signal-independent  noise  cases,  and  they  have  applied 

this  MAP  estimator  to  restoring  noise-degraded  images. 

For  such  applications,  and  in  the  special  case  where 

the  images  of  interest  exhibit  extremely  low  contrasts, 

conventional  restoration  techniques  perform  rather  poorly. 

Thus,  heuristic  algorithms,  such  as  the  so-called  "noise 

g 

cheating"  algorithm  for  film  grain  noise  suppression  , 
have  been  developed.  Other  algorithms,  which  explicitly 
include  the  signal  dependence  of  the  noise,  as  well  as  in- 
corporating pertinent  properties  of  the  human  visual  system, 
have  also  been  investigated7,10'11. 

The  purpose  of  this  paper,  then,  is  twofold.  First, 
several  fundamental  properties  of  signal-dependent  noise  are 
investigated  in  order  to  better  understand  when  consideration 
of  signal-dependence  is  warranted  and  when  it  cam  be  ignored. 
To  this  end,  the  mean  square  estimation  error  is  first  con- 
sidered for  both  the  signal-dependent  and  signal- independent 
cases.  In  addition,  the  mean  square  estimation  error  for  a 


mismatched  case  is  evaluated. 


The  mismatch  case  considered 


is  one  in  which  the  signal -dependent  measurement  model  is 
valid  but  is  ignored  for  purposes  of  simplification.  Secondly, 
optimal  estimators  are  derived  for  several  cases  of  both  signal- 
dependent  and  signal-independent  models.  The  Cramdr-Rao  lower 
bound  on  mean  square  estimation  error  is  also  determined,  in 
order  to  find  the  lowest  error  possible  for  both  signal-dependent 


and  signal-independent  estimators.  The  results  of  Monte 
Carlo  simulations  of  the  performance  of  the  various  optimal 
estimators  previously  derived  are  presented  for  several 
values  of  the  model  parameters  and  for  various  prior  signal 
probability  densities. 


Problem  Statement 

To  motivate  the  investigation  of  signal-dependent  noise 
processes,  it  is  necessary  first  to  define  the  models  to  be 
used.  The  signal-dependent  measurement  model  to  be  used  is 
given  by 


r = s + kf(s)n1  + n2,  (1) 

where  n^  and  n2  are  signal-independent  random  noise  processes; 
s is  the  underlying  signal  to  be  estimated  which  is  assumed  to 
have  probability  density  p(s) ; n^,  n2,  and  s are  assumed 
mutually  statistically  independent;  f(s)  is  any  function  of 
the  signal;  k is  a scalar  constant;  and  r is  the  noisy  measure- 
ment. The  signal-dependent  noise  term  in  Eq.  (1)  is,  of 
course,  the  term  kf(s)n^.  It  is  often  physically  reasonable 
to  assume  that  both  n^  and  n2  are  zero  mean  and  have  unimodal 
probability  densities.  Further,  note  that  substitution  of 
k ■ 0 in  Fq.  (1)  yields 
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which  is  just  the  familiar  textbook  additive,  signal- 
independent  noise  model^'2.  In  both  Eq.  (1)  and  Eq. 

(2) , the  arguments  of  all  of  the  variables  have  been  dropped 
for  simplification.  It  should  be  remembered  that  these 
arguments  may  depend  on  time,  position,  or  both. 

It  will  be  shown  repeatedly  that  the  model  of  Eq.  (2) 
yields  far  simplier  estimators  than  does  Eq.  (1) , as  would 
be  expected.  The  following  example  serves  to  illustrate 
why  it  may  prove  worthwhile  to  employ  the  more  complex 
estimators  resulting  from  Eq.  (1) . 

When  the  observations  are  actually  of  the  type  given 

2 

by  Eq.  (2) , it  can  be  shown  that  simply  using  the  re- 
ceived value  as  the  estimate  results  in  a minimum-variance 
unbiased  estimate,  i.e.. 


§ - r,  (3) 

where  the  circumflex  denotes  the  estimate.  The  average  error 
is  then  given  by 

E{3  - s}  - E{r  - s>  = E{n2J  = 0.  (4) 

The  estimator  is  said  to  be  unbiased  since  the  mean  error 
is  zero.  A measure  of  the  performance  of  this  estimator, 
conditioned  on  the  signal  value,  is  given  by  the  conditional 
mean  square  error,  and  is  found  to  be 


E { (§  - s ) 2 | s } = E{n2}  =*  a2, 


(5) 
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which  is  simply  the  variance  of  the  additive  noise  process 
n2*  This  estimator  is  obviously  simple  from  an  implementation 
point  of  view. 

With  this  in  mind,  consider  a case  in  which  the  observations 
are  actually  of  the  type  given  by  the  signal-dependent  model  of 
Eq.  (1) . For  ease  of  implementation  it  is  decided  to  use  the 
estimate  given  in  Eq.  (3) , which  was  designed  for  the  signal- 
independent  noise  process.  This  represents  a mismatched  situ- 
ation, where  an  estimator  based  upon  an  incorrect  measurement 
model  (corresponding  to  ignoring  the  signal-dependency)  is  used. 
Once  again,  the  average  estimation  error  is  zero,  due  to  n^ 
and  n2  being  assumed  zero  mean  and  to  the  assumed  mutual  statis- 
tical independence  of  n^ , n2 , and  s. 

However,  assuming  § = r,  the  mean  square  estimation  error 
for  this  mismatched  case  is  given  by 

E{ (§  - s ) 2 | s } = k2a2  E{[f(s)]2}  + a2  , (6) 

For  convex  f(s),  i.e.  f"  (s)  >_  0 for  all  s,  Jensen's  inequality 
states  that  E{f(s)}  >_  f[E{s}]^,  where  E { * } denotes  the 
expected  value.  This  inequality  may  be  used  to  find  a lower 
bound  for  the  mean  square  estimation  error  for  the  mismatched 
case.  Thus,  recalling  Eqs.  (5)  and  (6) , 

c2  < k2a2{f [E (s) ] }2  + a2  < k2a2  E{ [f (s) 1 2 > + a2. 

(7) 


I 
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Note  that  this  gives  a lower  bound  (the  middle  term)  on  the 
mean  square  estimation  error  of  the  mismatched  estimator, 
and  that  this  bound  contains  a function  of  the  signal’s 
mean.  The  leftmost  term  of  Eq.  (7)  is  the  mean  square 
estimation  error  given  by  Eq.  (5) . Note  that  the  mismatched 
mean  square  estimation  error  is  in  general  greater  than  the 
error  for  the  same  estimator  when  used  in  the  presence  of 
signal-independent  noise.  We  next  consider  an  illustration 
of  the  significance  of  Eq.  (7) . 

A commonly  used  model  in  image  processing  when  the 
observed  quantity  is  the  photographic  density  is  given  by 
Eq.  (1)  with  f(s)  given  by  sp.4'10  From  Eq.  (6),  then, 
the  mismatched  mean  square  estimation  error  becomes 

E{(§  - s ) 2 [ s } = k2  a2E{s2p}  + c2  , (8) 

where  k is  a scanning  constant  relating  the  scanning  aperture 

area  to  the  mean  area  of  a film  grain.  A typical  value  of 

p used  for  characterizing  photographic  film-grain  noise  is 

4 10 

p * 1/2,  though  p * 1/3  has  also  been  used.  ' Thus, 

Eq.  (8)  becomes  (for  p * 1/2) 

E{(§  - s ) 2 | s } * k2o2E(s}  + a2,  (9) 

which  is  greater  than  the  variance  of  Eq.  (5)  by  the  addition 
of  a term  which  is  proportional  to  the  signal  mean.  Note 
that  in  the  particular  case  of  p * 1/2,  the  equality  holds 


between  the  last  two  terms  in  Eq.  (7) , but  that  for  general 
p this  is  not  the  case.  Here,  the  lower  bound  on  the  mean 
square  estimation  error  given  by  Eq.  (7)  becomes 

E{(3  - s ) 2 | s } > k2o2[E(s)]2p  + a2.  (10) 

The  lower  bound  given  by  Eq.  (10)  may  be  visualized 

with  the  aid  of  Figs.  1 and  2,  for  various  values  of  k,  p, 

and  E(s).  In  all  cases,  the  plane  upon  which  the  surfaces 

rest  is  not  the  zero  plane,  but  rather  represents  a height 
( 2 

of  c^,  the  leftmost  term  of  Eq.  (7),  which  results  when  the 

[ 

estimator  of  Eq.  (3)  is  properly  matched  (to  the  signal- 

independent  noise  process  of  Eq.  (2)).  In  Fig.  1,  p is 

2 2 

fixed  at  a value  of  1/2,  and  are  set  equal  to  1 for 
illustration,  with  k and  E(s)  being  varied.  In  Fig.  2,  k 
is  fixed  (at  k = 1/2)  and  p is  varied.  It  should  be  noted 
that,  for  film  grain  noise  applications,  common  values  of  k 
are  in  the  range  of  from  about  0.3  to  about  0.7.  These 
figures  illustrate  the  marked  deviation  from  the  variance 
achieved  by  a properly  matched  signal-independent  estimator. 
Also,  it  should  be  remembered  that  these  surfaces  represent 
lower  bounds  on  the  mean  square  estimation  error  of  the  mis- 
matched estimator,  and  there  is  no  guarantee,  in  general, 
that  even  this  measure  of  performance  can  be  achieved.  Thus, 
optimal  estimators  based  on  the  proper  noise  model  are  needed. 
These  estimators  are  derived  in  the  following  sections. 


122 


MAP  Estimation 

An  appropriate  optimal  estimate  when  the  signal  is 
random  and  its  probability  density  function  is  known  a 

2 

priori  is  the  maximum  a posteriori  probability  (MAP)  estimate. 
This  estimate,  is  defined  to  be  that  value  of  s 

which  maximizes  the  a posteriori  density  p(s|r).  In  other 
words,  given  the  observation  r,  the  signal  value  maxi- 

mizes the  probability  of  that  value  of  r having  been  re- 
ceived. Maximizing  p(sjr)  is  equivalent  to  maximizing 
p(r|s)p(s),  or  alternately  the  logarithm  of  this  product. 

This  follows  from  the  facts  that  (a) 

p (s  |r)  - . (ID 

(b)  the  denominator  is  not  a function  of  s,  and  (c)  because 
monotonic  transformations  (such  as  the  logarithm)  preserve 
maxima  and  minima. 

As  an  example  of  the  calculation  of  a MAP  estimate, 

assume  that  n^  and  n2  are  both  zero  mean,  normally  distrib- 

2 2 

uted  random  variables  having  variances  and  o2,  respectively. 
In  this  case,  the  conditional  probability  density  p(r|s)  is 
also  normal,  with  a mean  of  s and  variance  v(s) , given  by 

v ( s ) » k2a2[f(s)]2  + c2.  (12) 

It  can  then  be  shown  that  the  MAP  estimate  is  a solution  of 
the  equation 
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(r  " §MAPJ  V'  (SMAP}  + 2(r  ^MAP^^MAP*  “ v' (§MAP}  V (§MAP* 


MAP'  'MAP' 


+ 2[v(SMAP^  33  £n  P(SMAP)  ~ °' 

MAP 


where  the  prime  denotes  the  partial  derivative  with  respect 


to  3. 


For  the  class  of  situations  where  f(s)  = sp,  and  assum- 


ing s is  distributed  normally  with  mean  u and  variance 

s 

2 

as,  Eq.  (13) , the  MAP  equation  becomes 

, 4k2o2o2  , ..  , - 4k2c?o2u  , 

t2k2a2(p-l) 1 Sm  - I2rk2a2(l-2p)  + / 2 s 


- [2a2  + [2pk2a2(r2-a2)]2^1 


2o2us, 


+ [2a‘r  ♦ -iji]  - [2pk4oJ)8jP;1  + [ 


4 4 

2k *g’u  . 

• La,  a4p 


2k4o?  . .. 

_ r L §4P+1  * o 

1 a2  JSMAP  U' 


The  MAP  estimate,  , is  a solution  of  Eq.  (14) . For  the 


specific  case  where  p ■ 1/2,  Eq.  (14)  reduces  to  the  cubic 


equation 


,v4  4 

2^  °1  3 

r a J 


2 2 2 4 4 

4k2o2a2  - 2k4ajus 


t— + t ~ o — + 2k  of ] 


M>82ap 


2o4  - 4k2a2a2u  . . 2 

+ [ 2 + k + 2a2^MAP 

o 

s 

+ [k2o2(a2  - r2)  - 2a2r  - ] = 0. 

a 
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Substitution  of  k = 0 into  Eq.  (14)  or  Eq.  (15)  yields 
the  MAP  estimate  for  the  signal-independent  noise  case  of 
Eq.  (2) , namely 


2 2 
°s  + °2 


2 2 Us* 
cs  + a2 


Comparison  of  Eqs.  (14)  and  (16)  demonstrates  the  much 
greater  complexity  of  the  estimator  structure  when  signal- 
dependent  noise  processes  are  taken  into  account. 

A shortcoming  of  the  Gaussian  density  as  a model  for  p(s) 
is  that  the  probability  of  a negative  value  of  s is  nonzero, 
regardless  of  E(s),  and  often  this  can  be  physically  impossible 
(for  example,  when  s represents  photographic  density) . However, 
one  common  probability  density  of  interest  for  some  classes 
of  images  is  the  Rayleigh  density,13  i.e.. 


p(s)  = -f  exp  [ ~]  , s > 0, 

a 2c* 


which  has  a positivity  constraint. 

Substitution  into  Eq.  (13),  the  MAP  equation,  with  f(s)  = 
as  before,  yields 

l2k2°l,p-v  - ^ 1S-  + [2^r(1-2pm- 

7 4 ~ 2 

2^  2,a2  . ,22  , 2 2 A *°2,,A2p 

~f2  2 2JSMAP  t2pk  l(r  2 p )1§MAP 


+[2o2rl3MAP  - t2*M<e-1)ISw!p  + l202] 


51r4  4 

,2k  ^l]54P+2 
‘ SMAP 


125 


This  equation  is  quite  similar  to  Eq.  (14)  with  y =0,  but 


each  term  is  greater  in  Eq.  (18)  by  degree  one.  Thus,  when 


p = 1/2,  the  MAP  estimate  3 


MAP 


is  a solution  of  the  quartic 
.2 


4 4 2 4 

2k  o.  , _ _ 4a-  , _ 2a  . _ 

[ 2 ^ ^MAP  + al^  + 2 ^ ^MAP  + ^2a2  + 2 ~ k al^MAP 

a a a 


[k2a2  (r2  + 3a2)  + Sra2]^^  - 2a4 


(19) 


As  before,  substitution  of  k = 0 into  Eq.  (19)  then  yields  the 
MAP  estimate  for  the  signal-independent  noise  case,  given  by 


MAP 


a2  + a 


[4a2a,(a2  + a2)  + r2a4]1/2 

y r + - ^ — = . (20) 

\ 2 (a^  + o\) 


Again  this  is  less  complex  than  the  MAP  equation  of  Eq.  (18)  , 
but  it  is  not  as  simple  as  the  solution  for  the  signal-inde- 
pendent noise  model  with  a normal  probability  density  for  s. 

Another  probability  density  with  a positivity  constraint 
is  the  folded  normal,  that  is,  the  absolute  value  of  a 
normally  distributed  random  variable.  Its  probability  density 
function  is  given  by 


p(s)  - [pN(s)  + PN(-s)]u(s),  (21) 

where  u(s)  is  the  unit  step  function  and  pN(s)  is  the  normal 

probability  density  function.  After  much  manipulation,  it 

may  be  shown  that  the  MAP  estimate  for  this  case  is  given 

by  the  value  of  which  satisfies 

MAP 


is  substituted  into  Eq.  (22)  to  obtain 


_ ,2US§MAI\  ,US  §MAP  , r_§MAP,  rMS+§MAP  r"§MAP,  _ „ 
exp  ( = — ) [ = + ^ — ]-[ 5 5 — ] = 0. 


2a. 


2a: 


(23) 


Neither  of  these  equations  lend  themselves  to  straightforward 
solution:  however,  it  is  once  again  obvious  that  the  signal- 
independent  noise  model  yields  a much  simpler  solution. 

ML  Estimation 


Another  commonly  used  estimator  is  the  maximum  likelihood 
2 

(ML)  estimator.  The  ML  estimate  is  employed  when  no  prior 
knowledge  of  the  signal  is  assumed,  and  it  is  found  by  maxi- 
mizing p(r|s)  over  s.  In  other  words  find  a value  of  s,  such 
that  given  s,  the  most  probable  observation  r which  would 
result  is  the  value  observed.  Using  the  signal -dependent 

measurement  model  of  Eq.  (2) , and  still  assuming  n^  and  ^ 

2 

are  zero  mean  normal  random  variables  with  variances  a^  and 
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a ^ , respectively,  the  ML  estimate,  3^,  is  a solution  of 
the  equation 

(23) 

where  v(^MIj)  and  v'  (Sj^)  are  as  defined  previously.  Again, 
considering  the  special  case  f(s)  =«  s?,  the  ML  equation 
becomes 

[2k2a2(p-l)]§£P+1  + [2rk2a2(l-2p) ] 3^  + [2pk2a2 (r2-a2) ] s^-1 


-[202]^^  - [2pk  a 1 1 + 2a2r  “ 


(24) 


This  equation  is  at  worst  no  more  complex  than  the  MAP 
equation,  Eq.  (14).  In  fact,  for  p = 1/2,  Eq.  (24)  becomes 
the  quadratic  equation 


(k2a2)  + [2a2  + + [k2a2(o2  - r2)-2o2r]  = 0. 


(25) 


This  has  as  its  positive  root  the  ML  estimate 

[r2  + (k2al  )2  t 2ra2  , , a2  ,21/2 

1 ( 2 } k2a2  k2a2  1 

k2a2  a2 


ML 


(26) 


The  ML  estimate  for  the  signal-independent  model  of  Eq.  (2) 
is  found  by  letting  k * 0 in  any  of  Eqs.  (23)  to  (25)  , and 
is  given  by 
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^ - r,  (27) 

Note  that  this  is  the  minimum  variance  unbiased  estimate 
used  in  Eq.  (3)  for  the  mismatched  example,  for  which  we 
earlier  found  the  mean  square  estimation  error. 

Another  point  worthy  of  note  is  the  similarity  between 
Eq.  (13) , the  general  MAP  equation,  and  the  ML  equation, 

Eq.  (23) . These  expressions  differ  only  by  an  additional 
term  in  Eq.  (13) , and  it  is  this  term  which  contains  all  of 
the  prior  knowledge  about  s.  This  term  vanishes  when  £np(s), 
and  hence  p(s),  is  constant.  In  other  words,  if  s is  distrib- 
uted uniformly  over  all  of  its  space  of  definition  (a  worst 
case) , then  knowledge  of  its  value  in  no  way  affects  the  maxi- 
mum of  p(s|r)  = p(r|s)p(s)  = c p(r|s).  Thus,  the  ML  estimator 
can  be  viewed  as  a worst  case  of  the  MAP  estimator.  Because 
the  MAP  estimator  embodies  a priori  information  about  s that 
is  not  present  in  the  formulation  of  the  ML  estimate,  it  would 
seem  reasonable  to  assume  that  the  MAP  estimate  would  exhibit 
a smaller  mean  square  estimation  error  than  the  ML  estimate. 

It  will  be  seen  that  this  is  indeed  the  case.  In  the  next 
section,  bounds  on  the  variances  of  these  estimates  will  be 
found. 

Cramer-Rao  Lower  Bounds 

A well-known  lower  bound  on  the  variance  of  any  unbiased 
estimate  for  a fixed  but  unknown  s is  the  Cram6r-Rao  error 
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bound.  Given  the  conditional  density  p(r|s),  the 
Cramer-Rao  bound  is  given  by 


var  (3-s  | s]  > {-e[3  tnP  (£L8J-]  }~L 

3sZ 


For  n^  and  n2  normal  with  zero  mean,  Eq.  (28)  reduces  to 


var[§-s|s]  > 


2 [v  (s)  ] 
2v(s)  + [v'  (s)  ] 


where  v(s)  and  v' (s)  are  as  given  by  Eq.  (12).  For  the 
signal-independent  noise  model,  which  is  the  result  of  let- 
ting k = 0 in  Eq.  (29) , the  Cramer-Rao  bound  is  given  by 


var [§  - s | s]  > ^2' 


which  is  the  variance  actually  achieved  by  the  ML  estimate 
of  Eq.  (27)  for  the  signal-independent  noise  case.  When 
equality  holds  in  Eq.  (28)  , the  estimate  § is  said  to  be 
efficient  (2).  Thus,  the  signal-independent  ML  estimate 
is  efficient  when  the  measurement  is  actually  of  the  form 
given  by  Eq.  (1) . 

For  f(s)  = sp,  as  before,  Eq.  (29)  becomes 


var[§-s|s]  > 


k4aJs4p+2k2cr2a2s2p+o4 

kia2s2p+2p2k4a4s4p_1+c 


which  for  p = 1/2  reduces  to 

k4a4s2+2k2a2a2s  + a4 

var  [§-s  | s]  i—^2 T1 T 

(k^c^+l/2k4oJ)s  + a 2 
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Although  it  is  not  obvious  by  inspection,  the  bound  given 

by  Eq.  (31)  may  actually  be  smaller  than  the  bound  given 

by  Eq.  (30) . In  other  words,  there  are  potentially  cases  where 

the  estimators  designed  for  the  signal-dependent  measurement 

model  may  actually  outperform  (in  a mean  square  estimation 

error  sense)  the  estimators  designed  for  the  signal-independent 

measurement  model.  To  better  illustrate  this,  Eq.  (31)  is 

plotted  in  Figs.  3 and  4.  In  the  first  of  these  k is  fixed  at 
2 2 

1/2,  and  a 2 at  one,  and  s and  p are  varied.  As  in  Figs.  1 

and  2,  the  plane  upon  which  the  surface  rests  is  not  the  zero 

plane,  but  rather  is  the  Cramer-Rao  lower  bound  given  by  Eq. 

2 

(30),  namely  o 2-  In  Fig.  4,  p is  fixed  at  1/2  and  kc^  is 
allowed  to  vary.  Now  it  is  worth  noting  that  in  all  of  the 
previous  equations,  when  k^O,  k and  a.,  always  appear  to- 
gether. Thus  varying  ko^  is  tantamount  to  fixing  either 
one  and  varying  the  other.  Note  that  in  Fig.  4,  for  certain 
values  of  kcr^  and  s,  the  Cramer-Rao  bound  of  Eq.  (31)  dips 

below  the  Cramer-Rao  bound  of  Eq.  (30)  , that  is,  it  dips 

2 

below  the  plane  o2.  This  is,  of  course,  the  region  mentioned 
above,  where  the  inclusion  of  signal -dependence  in  the 
measurement  model  may  potentially  result  in  improved  esti- 
mator performance.  The  values  of  s and  ko^  which  result  in 
this  region  are  given  by 

°2 

0 > s < — §■  [1  - -t 5-3  , (33) 

(kc^r 


■sauKss, 
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where  ko^  must  then  satisfy 

kax  > /2  . (34) 

Recall  that  these  equations  are  derived  for  the  p = 1/2 
case. 

To  get  a feeling  for  the  actual  mean  square  estimation 
error  achieved  by  the  estimators  derived  above,  Monte  Carlo 
simulations  were  performed,  with  the  results  presented  in 
the  next  section. 

Monte  Carlo  Simulations 

The  performance  of  each  of  the  estimators  derived  in 
the  previous  sections  was  evaluated  by  Monte  Carlo  simulations 
to  determine  the  mean  square  estimation  error.  The  results 
for  each  of  the  various  signal  probability  densities  were  so 
similar  that  only  one  case  is  presented.  The  Gaussian  case 
was  chosen  since,  for  the  MAP  estimate,  it  represents  the 
minimum  achievable  mean  square  estimation  error  (see  Appendix) . 
Figure  5 shows  the  mean  square  estimation  error  (MSEE)  of  the 
MAP  estimate  plotted  as  a function  of  the  signal  mean  E(s). 

In  Fig.  5a,  ka^  » 1,  while  in  Fig.  5b,  ka1  * 2.  The  solid 
line  is  the  MSEE  for  the  MAP  estimator  of  Eq.  (15)  and  the 
dashed  line  is  the  MSEE  for  the  mismatched  case,  that  is, 
for  the  MAP  estimate  of  Eq.  (16)  when  applied  to  the  signal- 
dependent  measurement.  Inclusion  of  signal-dependence  in 


the  estimator  structure  is  seen  to  yield  estimates  of  the 
signal  which,  on  the  average,  have  smaller  error  than  would 
be  the  case  when  signal-dependence  is  ignored.  It  should 
be  noted  that  for  sufficiently  small  ka^  and  small  signal 
means  the  signal-dependent  noise  term  is  negligible.  This 
results  in  the  estimates  for  the  mismatched  case  being  very 
nearly  equal  to  those  which  include  the  signal-dependence. 

Figure  6 presents  the  results  of  simulations  of 
the  ML  estimators.  As  before,  the  solid  line  represents  the 
signal-dependent  estimator  MSEE  and  the  dashed  line  repre- 
sents the  MSEE  for  the  mismatched  case.  Once  again,  in- 
clusion of  signal-dependence  is  seen  to  yield  better  esti- 
mates on  the  average.  Since  the  ML  estimates  include  no 
prior  knowledge  of  the  signal  statistics,  their  performance 
is  markedly  inferior  to  the  MAP  estimates,  but  as  previously 
discussed,  the  ML  estimate  represents  a worst  case.  As 
before,  for  small  ka^  and  small  E(s),  the  estimates  are 
very  nearly  equal  regardless  of  the  inclusion  of  signal- 
dependence  in  the  estimator  structure. 

Conclusion 

Many  physical  processes  are  described  by  a signal- 
dependent  observation  model.  It  has  been  shown  that,  in 
such  cases,  ignoring  the  signal-dependence  for  purposes 
of  designing  estimators  of  the  signal  may  result  in  severe 
penalties  in  terms  of  estimation  error.  Therefore,  optimal 
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estimators  which  include  the  signal-dependent  structure 
were  derived.  Specifically,  these  were  ML  estimates,  which 
include  no  prior  knowledge  of  signal  statistics,  and  MAP 
estimates,  which  assume  prior  knowledge  of  the  signal  prob- 
ability density.  The  latter  estimate  was  derived  for  the 
Gaussian,  Rayleigh,  and  folded  Gaussian  density  functions. 

The  performance  of  these  estimators  was  then  investigated 
by  Monte  Carlo  simulation.  As  expected,  inclusion  of  signal- 
dependence  in  the  estimator  structure  resulted  in  improved 
estimator  performance. 

Appendix 

Bayesian  estimators  are  those  estimators  which  serve  to 
minimize  the  Bayes  risk,  where  the  Bayes  risk  is  the  expected 
cost  of  estimation  based  on  some  cost  function.  For  example, 
minimum  mean-square  error  is  achieved  when  the  cost  is  pro- 
portional to  the  square  of  the  estimation  error,  i.e.,  when 
the  cost  function  is  a parabola.  The  MAP  estimator  is  a 

Bayesian  estimator  based  on  the  uniform  cost  function  shown 
2 

in  Fig.  7.  The  cost  for  no  error  is  zero  (as  it  is  for  some 

A region  about  no  error) , and  the  cost  of  any  other  error  is 

uniform  (all  errors  are  weighted  equally) . 

2 

It  can  be  shown  that,  under  certain  conditions,  the 
optimal  Bayes  estimate  is  invariant  for  a variety  of  cost 
functions,  and  is  equal  to  the  minimum  mean-square  error 
estimate.  These  conditions  are:  (1)  the  cost  function  is 
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convex,  (2)  the  cost  function  is  symmetrical,  (3)  the  a 

posteriori  probability  density,  p(s|r),  is  symmetrical, 

and  (4)  lim  C(s)p(s|r)  = 0,  where  C(s)  is  the  cost  function 

with  argument  s.  Condition  (4)  is  simply  a requirement  that 

the  a posteriori  density  goes  to  zero  faster  than  the  cost 
. 14 

function  increases.  Viterbi  has  shown  that  the  uniform 
cost  function  satisfies  these  conditions.  When  the  prior 
signal  density,  p(s),  is  assumed  Gaussian,  then  clearly 
p(s|r)  is  symmetrical,  as  required  in  condition  (3).  Thus, 
for  this  case  we  have  the  optimal  Bayesian  estimate,  and  it 
is  the  estimate  which  yields  the  minimum  mean-square  estimation 
error. 
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Figure  5.  Mean  square  estimation  error  for  the  MAP  estimator, 

as  a function  of  the  signal  mean  E(s),  with  a)  ka.  = 1 
and  b)  ka^  * 2.  The  solid  line  is  the  signal-dependent 
estimator  error  and  the  dashed  line  is  the  mismatched 
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Figure  6.  Mean  square  estimation  error  for  the  ML  estimator, 

as  a function  of  the  signal  mean  E(s),  with  a)  ka,=“l 
and  b)  ko,«2.  The  solid  line  is  the  signal-dependent 
estimator  error  and  the  dashed  line  is  the  mismatched 
estimator  error. 
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Abstract 

A survey  of  ongoing  research  into  the  existence  in- 
variants and  relative  invariants  for  application  in  pattern 
recognition  is  presented-  A mathematical  formalism  is  de- 
veloped and  a complete  characterization  of  the  relative 
invariants  is  given. 

Introduction 
— ■ ■ — — — — 

The  importance  of  group  theory  as  a tool  to  be  exploited  in 
modelling  a variety  of  perceptual  phenomena  has  been  demonstrated 
by  a number  of  writers  [2,11,12,18].  Although  the  influence  of 
group  theory  is  implicit  in  much  of  the  literature  on  pattern 
recognition  [1,8,13,16,20],  relatively  few  instances  can  be  found 
in  which  explicit  utilization  of  group  theory  is  the  central 
theme  [4,6,7,19],  Without  exception,  group  theory  has  been  used 
to  effectively  model  some  aspect  or  feature  which  is  invariant 
under  transformation  and  to  exploit  this  invariance  in  performing 
the  recognition  function.  However,  no  definitive  study  has  been 
made  of  transformational  invariance  and  no  general  model  has  been 
introduced  which  attempts  to  formalize  the  concept  of  invariance 
as  it  relates  to  pattern  recognition.  This  is  indeed  strange  in 
view  of  the  relatively  advanced  state  of  the  theory  of  invariants 
within  group  theory  [10,15,21]. 

In  the  following  we  formulate  a general  model  in  which  many 
problems  in  pattern  recognition  may  be  cast  in  a natural  fashion. 
We  discuss  representations  of  patterns  as  functions  defined  on  a 
group  and  proceed  to  investigate  the  existence  of  invariant 


functionals . 


The  Model 


Let  ft  denote  a set  of  objects  called  patterns  and  assume 
that  G is  a group  of  transformations  which  act  on  ft  on  the  left. 

For  weft  and  g e G we  denote  by  gw  the  image  of  w under  the  trans- 
formation g.  Also,  for  9±'<32^  G we  denote  their  product  or  com- 
position by  Action  on  the  left  is  then  given  by  the  identity 

(gxg2) w - gx  (g2w> » d) 

for  g1,g2fi  G and  w 6 ft. 

We  now  assume  that  our  ability  to  "recognize"  and/or  otherwise 
"classify"  patterns  is  obtained  via  measurements  performed  upon 
individual  patterns.  Such  measurements  can  take  values  of  a quite 
general  nature,  although  the  usual  situation  will  result  in  a 
vector  of  real  numbers.  Accordingly,  we  define  a measurement 
function  to  be  a mapping  R:  ft  -*■  V,  where  V is  a suitable  set  of 
permissible  values.  We  shall  later  assume  that  V is  a real  finite- 
dimensional vector  space.  We  say  that  a measurement  function 
R:  ft  + V is  invariant  provided  that  R(gw)  = R(w)  for  all  w€  ft 
and  g£G.  Observe  that  an  invariant  measurement  does  not  dis- 
tinguish between  the  various  members  of  an  orbit  [w]  = {gw|geG}, 
being  constant  on  each  such  orbit. 

More  generally,  we  say  that  a measurement  R:  ft  + V is  rela- 
tively invariant  provided  that  R(gw)  = p(g)R(w).  Here  p is  a 
homomorphism  of  G into  a group  of  transformations  on  V and  is 
called  the  modulus  of  R.  As  a matter  of  practice,  we  are  interested 
in  the  case  in  which  V is  a finite  dimensional  vector  space  and  p 
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is  a representation  of  G in  the  group  GL(v)  of  invertible  linear 
transformations  on  V.  Note  that  a relative  invariant  not  only 
depends  upon  the  orbit  of  w but  is  also  sensitive  to  "position" 
within  the  orbit. 

In  applications  one  must  solve  simultaneous  equations 

R (w)  = Rm  involving  a number  of  invariants  {R  } and  associated 

a cl  a 

actual  measurements  {R™}  to  classify  the  orbit  of  w and  then 
solve  similar  equations  Pg(g)Rg(w)  = R™  involving  relative  in- 
variants to  determine  position  within  the  orbit.  Hence  we  see 
that  the  question  of  existence  of  invariants  and  relative  in- 
variants become  of  paramount  importance. 

Representations 

In  order  to  pursue  the  question  of  existence  of  invariants 
we  find  the  need  of  considerably  more  structure  than  we  have  as- 
sumed at  this  point.  It  is  somewhat  surprising  that  this  addi- 
tional structure  can  be  imposed  on  the  transformation  group  and 
need  not  involve  restrictive  assumptions  about  the  space  of 
patterns.  Since  the  transformation  groups  that  are  typically 
encountered  are  quite  rich  in  structure,  we  find  ourselves  in  an 
advantageous  situation. 

Let  us  briefly  digress;  Suppose  that  X is  any  set  and  that 
the  group  G acts  on  X on  the  left.  Let  f:  X -*■  Y be  a mapping  of 
X to  some  set  Y.  Then  for  any  g G we  may  define  a new  mapping 
gf:  X -*■  Y given  by 


(gf)  (x)  = f (g_1x)  , xe  X. 


(2) 
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Note  that  appearance  of  g \ rather  than  g,  is  a convenience 
which  makes  certain  formulae  more  natural  for  later  use.  We 
easily  verify  that 

gl(g2f)  = (9i92)f  (3) 

so  that  if  F is  a set  of  functions  such  that  gf  £ F for  all  f^F, 
then  (2)  defines  an  action  of  G on  the  left  of  F. 

Now  let  R:  12  -*■  V be  a given  measurement  function.  For  each 
w £ 12  we  may  define  a function  wr:  12  V as  follows: 

wr  (x)  = R(x_1w)  , x e G (4) 

The  correspondence  r:  w -*■  wr  thus  defines  a mapping  of  12  into  the 
set  F(G,V)  of  functions  from  G to  V.  Now  for  fixed  g € G we  see 

that  for  all  x £ G,  (gwr)  (x)  = wr(g  ^x)  = R((g-1x)_1w)  = 

R( (x  1g)w)  = R(x  ^ (gw) ) = [(gw)r](x).  That  is, 

gwr  = (gw)r,  (5) 

for  all  w€n  and  g€G.  Equation  (5)  establishes  the  desired  con- 
nection between  our  patterns  and  the  V-valued  functions  on  G.  We 
define  a representation  of  12  on  V to  be  a map  r:  12  -*•  F(G,V)  which 
satisfies  (5) . Such  a representation  allows  a concrete  interpre- 
tation of  patterns  as  suitable  functions  defined  on  the  group. 

We  have  the  following 

Theorem  1.  The  representations  r of  12  on  V correspond  one-to-one 
to  the  measurement  functions  R:  12  •+•  V.  The  correspondence  is 
given  via  r <-*  R if  and  only  if 

wr(x)  = r(x  ^w) , x e g,  wen. 
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Proof:  We  have  already  seen  that  each  measurement  R defines 
a representation.  Now,  let  w -►  w be  a representation  of  ft  on  V 
and  let  us  set  R(w)  = w(lr) , where  1G  is  the  identity  element  of 

We  must  show  that  w1-  as  defined  by  (4)  satisfies  wr  = w.  But  for 
xfeG  we  have  w(x)  = (x-1w) (1  ) = (x-1w) (1  ) = R(x_1w)  = wr(x), 

from  which  w = wr,  as  desired. 

Let  us  point  out  that  an  invariant  measurement  R is  charac- 
terized by  the  condition  that  each  wr  is  a constant  function, 
which  on  the  surface  seems  somewhat  uninteresting.  This  is  a 
deceptive  simplification,  however,  as  will  be  apparent  later. 
Similarly,  if  R is  a relative  invariant  with  modulus  p,  we  see 

that  wr(x)  = p(x  ^)R(w). 

Before  proceeding  to  pursue  the  existence  of  invariants,  it 
seems  appropriate  to  further  accent  the  importance  of  relative 
invariants  by  demonstrating  one  of  their  fundamental  properties. 
Let  R:  ft  -*■  V be  a relative  invariant  with  modulus  p.  Suppose 
that  w^jW^fe  ft  and  that  R(w^)  = R(w2).  Then  for  any  g £ G we  have 

R(gw^)  = p(g)R(w1)  = p(g)R(w2)  = R(gw2)  . Thus, 

R(w^)  = R(w2)  implies  RCgw^  = R(gw2)  . 

It  is  somewhat  interesting  to  note  that  the  condition  above  is 
a complete  characterization  of  relative  invariants  as  is  shown 
in  the  following: 

Theorem  2_.  In  order  that  R:  ft  -*  V be  relatively  invariant  it  is 
necessary  and  sufficient  that 

R(w^)  = R(w2)  implies  R(gw^)  = R(gw2) 
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for  all  g € G. 

Proof:  Necessity  has  already  heen  shown.  Conversely, 

suppose  that  R:  ft  -*■  V satisfies  the  stated  condition.  We  must 
construct  a homomorphism  of  G into  the  group  Sym(V)  of  trans- 
formations on  V.  For  v = R(w)g  v and  g£G,  let  us  define 
p (g) v = R(gw) . We  note  that  this  definition  does  not  depend  on 
w for  if  also  v = R(w')  then  R(gw')  = R(gw) , by  the  property  of 
R.  If  v£  V is  not  of  the  form  v = R(w)  then  set  p(g)v  = v. 

We  easily  verify  that  each  p(g)t  Sym(V) . Also, 

p(g1)p(g2)v  = P (g-jj  P (g2)  R(gw)  ■ p(g1)R(g2w)  = R(g1g2w)  = o(g1g2)v 
in  case  v = R(w)  and  p(g^)p(g2)v  = v = p(g^g2)v  otherwise. 

Thus,  p(g^)p(g2)  = p(g^g2)  so  that  p is  indeed  a homomorphism. 

Finally,  by  definition  of  p(g)  we  see  that  R(gw)  = p(g)R(w) 
for  all  g € G and  w £ Q so  that  R is  a relative  invariant  with  p 
as  modulus. 

Invariants  and  Relative  Invariants 

As  previously  stated,  we  may  impose  additional  structure  by 
invoking  restrictions  on  the  transformation  group.  In  the  fol- 
lowing we  assume  that  V is  a real  vector  space  of  finite  dimension 
and  that  G is  a locally  compact  topological  group.  Such  a group 
admits  a left  invariant  integral,  called  left  Haar  measure  [15] , 
and  the  integration  theory  for  such  groups  is  well  established. 

The  fundamental  technique  for  construction  of  invariants  will 
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be  the  computation  of  average  values  over  the  entire  group  G. 
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This  technique  was  exploited  hy  Pitts  and  McCulloch  118]  in  their 
classic  work  on  the  perception  of  audio  and  visual  forms.  It 
appears  also  in  the  classical  theory  of  group  representation  [21] 
and  is  prevalent  in  modern  analysis  [10,15].  Group  averaging  has 
been  used  as  a tool  in  pattern  recognition  in  a relatively  few 
instances,  implicitly  in  [1.13]  and  explicitly  in  [5,6,7]. 

Now,  let  y denote  the  left  Haar  measure  on  G and  let 
f:  G -*•  V.  We  define  the  mean  value  of  f,  provided  it  exists,  by 


M(f)  - ££  btk r ;Kf  dM' 


(6) 


where  K is  a compact  subset  of  G and  the  limit  is  taken  as  K 

increases.  Note  that  K compact  implies  that  g~^K  is  compact  and 

that  y(K)  = y(g  1K)  . This  together  with  the  fact  that 

/ gfdy  = / i f dy,  g t G,  shows  the  following: 

K g~1K 

Lemma  1.  If  M(f)  exists  then  for  any  g€G,  M(gf)  exists  and 
M ( f ) = M(gf)  . 

We  denote  by  L(V)  , or  simply  L,  the  set  of  all  f : G -*■  V for 
which  M ( f ) exists.  We  have  the  following: 

Lemma  2.  L(V)  is  a linear  space  on  which  G acts  as  the  left  as 
a group  of  linear  transformations . Moreover,  M is  an  invariant 
linear  transformation  of  L(V)  into  V. 

More  generally,  let  p be  a representation  of  G in  the  group 
GL (V)  of  invertible  linear  transformations  on  V.  We  form  the 
weighted  average  of  f:  G -*■  V,  provided  the  limit  exist,  as  follows: 

M (f)  = lira  -4=tt-  / p (x)  f (x)  dy  (x)  , (7) 

p KtG  K 


where  K is  compact,  as  above.  The  set  of  functions  for  which 
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Mp(f)  exists  will  be  denoted  hy  L(V,p),  or  simply  L(p) . Note 
that  by  the  substitution  y = we  obtain 

/Rp (x) gf (x) dy(x)  = /Rp (x) f (g_1x) dx (x)  = / p (gy) f (y) dy (y)  = 

g K 

P (g)  / P (y)  f (y)  dy  (y)  . It  follows  immediately  that 
g K 


Mpfgf)  = p(g)Mp(f),  (8) 

for  all  gtG,  f 6L(V,p).  We  conclude: 

Theorem  _3*  I*(V,p)  is  a linear  space  on  which  G acts  as  a group 

of  linear  transformations.  Also,  is  a linear  mapping  of 
L ( V, p ) into  V which  is  relatively  invariant  with  modulus  p. 

Let  us  define  p':  G ■*  GL(V)  by  p'(x)  =-•  p(x_1)  = (p(x))-1. 

We  have  p'(xy)  = p'(y)p'(x),  so  that  p'  is  a dual  homomorphism. 

For  f 6L(V)  we  may  consider  the  product  p'f  given  by  (p"f) (x)  = 
p'(x)f(x),  x€g.  Since  p(x)p'(x)  = 1 we  see  that  pyf  6 L(V,p) 
whenever  f 6 L(V)  . Similarly,  f€L(V)  implies  pf€L(V,p). 

We  evidently  then  can  use  p and  p'  as  multipliers  to  pass 
back  and  forth  between  L(V)  and  L(V,P).  Thus: 

Lemma  _3:  The  map  f + p'f  is  a linear  isomorphism  from  L(V)  onto 
L(V,p).  Moreover,  M(f)  = M (p'f). 

Although  this  shows  that  the  linear  structure  of  L(V)  and 
L (V, p ) are  no  different,  it  is  important  to  observe  that  they  are 
quite  different  with  respect  to  the  action  of  G. 

We  may  now  state  sufficient  conditions  for  the  existence  of 
invariants  and/or  relative  invariants  for  the  pattern  space  Q. 
Quite  simply,  if  R:  + V is  such  that  each  wr£L(V,p)  then 
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we  obtain  a relative  invariant  R hy  defining 

R(w)  = Mp (wr)  (9) 

We  see  that  R(gw)  = Mp((gw)r)  = Mp(g  wr)  = o(g)  Mp(wr)  = p(g)R(w), 

as  desired.  We  obtain  the  corresponding  result  for  invariants 
in  the  special  case  in  which  p is  the  trivial  representation, 

p (g)  5 iv- 

Recall  that  if  R:  0 -*■  V is  relatively  invariant  with  modulus 
p,  then  we  may  write  vr^x)  = p(x_1)R(w)  for  all  w£fl,  xfcG.  Thus, 
we  have  for  each  compact  subset  K of  G,  /Kp(x)wr(x)  dp(x)  = 

/KR(w)dy(x)  = p.(K)  R(w)  . Comparison  with  (7)  shows  that  (wr) 

exists  and  is  equal  to  R(w) . This  shows  that  each  wr£L(V,p) 
as  well  as  the  identity  Mp(wr)  = R(w) . We  have  thus  shown: 

Theorem  £.  If  R:  (!  + V is  such  that  each  wr  6 L(V,p)  , then 

R(w)  = Mp (wr)  defines  a relative  invariant  R with  modulus  p. 
Conversely,  every  relative  invariant  is  precisely  of  this  form, 
since  if  R is  relatively  invariant  with  modulus  p,  then  each 

wr£  L ( V,  p ) and  R = R. 

The  above  may  be  paraphrased  by  saying  that  the  construction 
of  relative  invariants  with  a given  modulus  p is  equivalent  to 
the  construction  of  a representation  w + wr  of  (!  on  V such  that 
each  w € L(V,p).  Observe  that,  in  particular,  we  have  shown 
in  a strict  sense  that  every  relative  invariant  is  a weighted 
average  over  the  entire  group  G. 
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The  result  in  Theorem  4 gives  valuable  insight  to  the  nature 
of  invariants  and  relative  invariants.  Nevertheless,  it  is  less 
than  satisfying  in  certain  ways.  In  the  first  place,  it  gives  no 
clue  as  to  how  to  construct  a suitable  R,  although  it  can  certainly 
eliminate  a number  of  choices.  Consequently,  it  is  not  a true 
existence  theorem  in  the  sense  that  for  a given  application  it 
does  not  actually  produce  an  invariant.  Moreover,  there  are  many 
examples  of  invariants  which  occur  in  natural  ways  but  are  not 
presented  in  the  form  given  above  (although  they  are  necessarily 
equivalent  to  such  a form) . 

Existence  of  Invariants 

The  consideration  of  the  group  average  in  the  proceeding 
section  led  to  the  existence  of  relative  invariants  and  is  appli- 
cable in  any  situation  where  each  wr  belongs  to  a class  of  func- 
tions for  which  such  an  average  exists.  This  is  the  case,  for 
example,  when  the  class  of  functions  is  almost  periodic  (in  the 
sense  of  J.  von  Neumann  [10].  It  is,  however,  applicable  in  a 
wider  variety  of  cases,  namely  those  in  which  set  B(G)  of  bounded 
real  valued  functions  admits  an  invariant  mean,  in  the  following 
sense : 

Definition;  An  invariant  mean  M on  the  class  B(G)  of  bounded  real 
valued  functions  on  G is  a real  linear  functional  M on  B(G)  which 
is  invariant  under  the  action  of  G on  B(G)  and  satisfies 


inf  f < M(f)  < sup  f , f € B (G) . 


(10) 
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Let  V denote  the  dual  space  of  V and  for  each  f (,  B(G,V), 

It  It 

the  hounded  functions  from  G to  V,  and  for  each  v e V we  ob- 

* 

serve  that  v o f£B(G). 

Lemma  £.  Let  Mq  £ (B(G))  , the  dual  of  B(G).  There  exists  a 

unique  M € Lin (B (G,V) , V)  such  that 

v*  o M = Mq  o v*  (11) 

for  all  v*  € V*. 

Proof:  It  is  clear  that  any  M satisfying  (11)  is  unique. 

Let  v^ , v 2 ,vn  be  a basis  m V and  v1^v2'---'vn  a dual  basis 

* * 

mV,  so  that  = S^.  Let  us  define  M:  B(G,V)  - V by 


M(f) 


n * 

- Mf\  ( v . of ) v . 
i=l  u 1 1 


f € 3 (G , V)  . 


(12) 


, * * * * 

Then  for  any  v fe  V we  have  v o M (f)  = <v  ,M(f) > = 

^ * * n * * * 

£ < v # v • > Mn(v. o f)  = M„(  £ < v , v . > v • o f)  = Mn (v  o f) . That 
i=l  101  ° i=l  11  0 

★ * 

v o M = MgO  V , as  desired. 

Let  us  observe  that  for  any  A€Lin(V,V)  and  f€B(G,V)  we  may 
define  the  composite  Af  so  that  B(G,V)  may  be  considered  as  a 
(left)  module  over  Lin(V,V).  With  this  in  mind,  we  observe: 

Lemma  5:  The  linear  map  M;  B(G,V)+  V defined  by  (11)  above  is  a 

morphism  of  B(G,V)  to  V considered  as  modules  over  Lin(V,V). 

_ * * 

Proof:  For  any  A€Lin(V,V),  v e V and  f€B(G,V),  we  have 

v*o  M (Af ) = Mq(v*o  Af)  = Mq ( (A  v ) of ) = A V o M(f)  = v o AM ( f ) . 
Hence,  M (Af ) = AM(f) , completing  the  proof. 
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Lemma  £:  If  Mg  £ (B(G))  is  invariant  under  G then  so  is  the  map 
M as  defined  by  (11) . 

Proof : Mg  invariant  means  that  for  any  f Q £ B (G)  and  g€G, 

we  have  Mg(gfQ)  = MQ(fg).  Thus,  if  v 6 V , f tB(G,V)  then  for 

* ★ * 

any  g € G we  have  voM(gf)  = Mg (v  o f ) =Mg(g(vof))  = 

* * 

Mg (v  o f)  = v o M (f ) . Then,  M(gf)  = M(f) , as  desired. 

We  have  thus  shown  how  to  "lift"  invariant  linear  functionals 
on  B (G)  to  invariant  linear  maps  from  B(G,V)  to  V. 

Corollary:  If  G admits  an  invariant  mean  Mg  and  R:  ft  V 

is  such  that  each  wr€  B(G,V),  then 

R(w)  = M(wr)  , (12) 

where  M is  given  by  (11) , is  an  invariant  measurement. 

Proof : R(gw)  = M((gw)r)  = M(gwr)  = M(wr)  = R(w). 

We  may  obtain  relative  invariants  in  a similar  fashion. 
However,  to  remain  within  the  bounded  functions,  we  restrict  our 
attention  to  unitary  representations  of  G. 

Let  us  suppose  that  Mg  is  a given  invariant  mean  on  the 

class  of  bounded  function  on  G and  that  M is  the  lifted  map  de- 
fined by  (11)  above.  Also,  let  p be  a given  unitary  representa- 
tion of  G in  GL (V) . Observe  then  that  for  each  f 6 B(G,V)  we 
have  also  pf€B(G,V),  where  (pf)  (g)  = p(g)f(g),  g € G.  Now,  let 
R:  ft  -*■  V be  a given  measurement  function  such  that  each  wr£  B(G,V)  . 


Since  this  simply  means  that  the  values  of  R on  the  orbit  of  w 
are  bounded,  this  is  not  deemed  to  be  a serious  restriction. 


With,  this  in  mind,  let  us  note  that  p(gw)r  = p(g)  g(pwr) 
for  all  g€  G,  w€ft.  To  see  this,  we  have,  at  any  x€  G, 

[p(gw)r](x)  = p (x) (gwr) (x)  = p(x)  wr(g_1x)  = p(g)  p (g-1x) wr (g-1x) = 
p(g)[g(pw  ) ] (x) . Also,  let  us  observe  that,  for  fixed  g, 
p(g)£  Lin(V,V)  and  that  M is  a morphism  of  Lin (V,V) -modules . We 
now  define  R:  Q ■+  V by  the  formula 

R(w)  = M ( pwr ) , w€  fi.  (13) 

Recalling  the  invariance  of  M,  and  the  facts  above,  we  see  that 

for  g£G,  we  have  R(gw)  = M(p(gw)r)  = M(  p (g)  g ( pwr ) ) 

= p(g)  M (g ( pwr) ) = p (g) M ( pwr ) = p(g)R(w). 

That  is,  R is  a relative  invariant  and  has  the  given  representation 
p as  its  modulus.  We  have  therefore  proved  the  following  remark- 
able result: 

Theorem  5_.  If  B(G)  admits  an  invariant  mean  Mg , p is  any  unitary 
representation  of  G in  GL(V),  and  a non-trivial  bounded  measure- 
ment function  R:  fi  ->-  V exists,  then  there  exists  a non-trivial 
relative  invariant  R:  0 V with  modulus  p.  R is  given  explicitly 
by 

R(w)  = M(pwr)  , (13) 

where  M is  the  lift  of  MQ  to  B(G,V) . 

Note:  The  appearance  of  the  words  non-trivial  in  the  above 

requires  slight  explanation.  We  can  clearly  define  M:  B(G,V)  -*■  V 
by  M ( f ) = M (p f ) and  deduce  that  M(gf)  = p(g)M(f).  The  fact  that 

M £ 0 gives  M ^ 0.  Since  R(w)  = M(wr),  we  see  the  sense  in  which 
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R is  non-trivial,  i.e.,  it  is  the  restriction  of  M to  the  func- 
tions =»  {wr|w€  n}.  Nevertheless,  it  could  happen  that  each 

17  — 

pw  £ kerM  so  that  R = 0 even  though  R % 0.  This  is  unlikely  and 
can  be  ignored  if  we  have  some  pwr  > 0.  For  such  w SI  we  see 
that  R(w)  > 0. 

Summary 

We  have  shown  that  every  set  of  patterns  subject  to  a trans- 
formation group  is  representable  as  functions  defined  on  the  group 
and  that  such  representations  are  implicit  in  the  measurement 
process.  It  has  also  been  shown  that  every  relative  invariant 
is  equivalent  to  a weighted  average  over  the  group  of  a measure- 
ment on  the  patterns.  Moreover,  the  existence  of  suitable  many 
relative  invariants  have  been  demonstrated  in  any  situation  in 
which  measurements  are  bounded  and  the  group  admits  an  invariant 
mean . 
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